Topology of Word Embeddings: Singularities Reflect Polysemy

Alexander Jakubowski; Milica Gašić; Marcus Zibrowius

arXiv:2011.09413 · cs.CL · November 19, 2020 · 5 cites

TL;DR

This paper studies the topological structure of word embeddings, argues that polysemous words correspond to singular points of a pinched manifold, and introduces a topological measure of polysemy based on persistent homology.

Contribution

The paper proposes a topological framework for word embeddings that links singularities to polysemy, and offers empirical methods for measuring polysemy and for inducing and disambiguating word senses.

Findings

01 A topological measure based on persistent homology correlates with the number of meanings of a word

02 Singular points in embeddings indicate polysemy

03 A topologically motivated approach achieves competitive results on the SemEval-2010 Word Sense Induction & Disambiguation task

Abstract

The manifold hypothesis suggests that word vectors live on a submanifold within their ambient vector space. We argue that we should, more accurately, expect them to live on a pinched manifold: a singular quotient of a manifold obtained by identifying some of its points. The identified, singular points correspond to polysemous words, i.e. words with multiple meanings. Our point of view suggests that monosemous and polysemous words can be distinguished based on the topology of their neighbourhoods. We present two kinds of empirical evidence to support this point of view: (1) We introduce a topological measure of polysemy based on persistent homology that correlates well with the actual number of meanings of a word. (2) We propose a simple, topologically motivated solution to the SemEval-2010 task on Word Sense Induction & Disambiguation that produces competitive results.
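
The claim that neighbourhood topology separates monosemous from polysemous words can be illustrated on a toy example. At a regular point of a manifold of dimension at least 2, a small punctured neighbourhood is connected; at a point where several sheets have been pinched together, it falls into one piece per sheet. The sketch below is not the authors' measure. It only counts those pieces with 0-dimensional persistent homology, computed by single-linkage merging of the unit directions from the centre to its neighbours. The function names, the synthetic sheets in 4-dimensional space, and the `gap` threshold are all illustrative assumptions.

```python
import math
from itertools import combinations


def h0_death_times(points):
    """Finite death times of 0-dimensional Vietoris-Rips persistence.

    Every point is born at scale 0. A component dies when it merges into
    another, which happens at the minimum-spanning-tree edge lengths.
    """
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    deaths = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            deaths.append(length)
    return deaths  # one component never dies and is not listed


def punctured_directions(center, neighbours):
    """Unit directions from center to each neighbour; the centre is dropped."""
    directions = []
    for point in neighbours:
        diff = [a - b for a, b in zip(point, center)]
        norm = math.hypot(*diff)
        if norm > 0:
            directions.append([d / norm for d in diff])
    return directions


def count_branches(center, neighbours, gap=1.0):
    """Number of H0 classes that survive past `gap` (assumed threshold)."""
    directions = punctured_directions(center, neighbours)
    if not directions:
        return 0
    return 1 + sum(death > gap for death in h0_death_times(directions))


def circle_sheet(axis_a, axis_b, n=12, radius=0.5, dim=4):
    """Synthetic 2-d sheet through the origin, sampled on a circle."""
    sheet = []
    for k in range(n):
        point = [0.0] * dim
        point[axis_a] = radius * math.cos(2 * math.pi * k / n)
        point[axis_b] = radius * math.sin(2 * math.pi * k / n)
        sheet.append(point)
    return sheet


origin = [0.0, 0.0, 0.0, 0.0]
regular = circle_sheet(0, 1)                        # one sheet
pinched = circle_sheet(0, 1) + circle_sheet(2, 3)   # two sheets glued at origin
print(count_branches(origin, regular), count_branches(origin, pinched))
```

On this synthetic data the single sheet yields one long-lived component and the two glued sheets yield two. Real embeddings are noisy and high-dimensional, so a fixed `gap` would need tuning, and a dedicated persistent homology library would be the more usual choice for anything beyond dimension 0.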

Taxonomy

Topics: Topological and Geometric Data Analysis · Natural Language Processing Techniques · Topic Modeling