Skip-gram word embeddings in hyperbolic space
Matthias Leimeister, Benjamin J. Wilson

TL;DR
This paper introduces a method for learning word embeddings in hyperbolic space using a modified skip-gram model, aiming to capture hierarchical and scale-free structures in language data.
Contribution
It derives an objective function based on hyperbolic distance and incorporates it into the skip-gram negative-sampling architecture of word2vec, so that word embeddings in hyperbolic space can be learned directly from free text.
Findings
Hyperbolic embeddings perform well on similarity and analogy tasks.
Low-dimensional hyperbolic embeddings show promising results.
Hyperbolic embeddings show no clear superiority over their Euclidean counterparts.
Abstract
Recent work has demonstrated that embeddings of tree-like graphs in hyperbolic space surpass their Euclidean counterparts in performance by a large margin. Inspired by these results and scale-free structure in the word co-occurrence graph, we present an algorithm for learning word embeddings in hyperbolic space from free text. An objective function based on the hyperbolic distance is derived and included in the skip-gram negative-sampling architecture of word2vec. The hyperbolic word embeddings are then evaluated on word similarity and analogy benchmarks. The results demonstrate the potential of hyperbolic word embeddings, particularly in low dimensions, though without clear superiority over their Euclidean counterparts. We further discuss subtleties in the formulation of the analogy task in curved spaces.
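To make the idea of a distance-based skip-gram objective concrete, here is a minimal sketch in Python. It is not the authors' implementation. It assumes the hyperboloid (Lorentz) model of hyperbolic space, which the abstract does not specify, and it scores a word pair by the negative hyperbolic distance plus a hypothetical `shift` parameter. The function names (`lift_to_hyperboloid`, `sgns_loss`, and so on) are illustrative only.

```python
import numpy as np

def lift_to_hyperboloid(x):
    """Map a Euclidean vector x in R^n to a point on the hyperboloid in R^(n+1).

    The first coordinate is chosen so that the Minkowski norm equals -1.
    """
    x = np.asarray(x, dtype=float)
    x0 = np.sqrt(1.0 + np.dot(x, x))
    return np.concatenate(([x0], x))

def minkowski_dot(u, v):
    """Minkowski (Lorentzian) inner product: -u0*v0 + sum_i ui*vi."""
    return -u[0] * v[0] + np.dot(u[1:], v[1:])

def hyperbolic_distance(u, v):
    """Geodesic distance on the hyperboloid: arccosh(-<u, v>)."""
    # Clamp to 1.0 so rounding error cannot push the argument out of range.
    return np.arccosh(np.maximum(-minkowski_dot(u, v), 1.0))

def log_sigmoid(z):
    """Numerically stable log(sigmoid(z))."""
    return -np.logaddexp(0.0, -z)

def sgns_loss(target, context, negatives, shift=0.0):
    """Negative-sampling loss for one (target, context) pair.

    Assumption: the pair score is (shift - distance), so nearby points
    score high. The paper's exact scoring function may differ.
    """
    positive = log_sigmoid(shift - hyperbolic_distance(target, context))
    negative = sum(
        log_sigmoid(hyperbolic_distance(target, n) - shift) for n in negatives
    )
    return -(positive + negative)

# Usage: a context word close to the target should give a lower loss
# than one far away, when the other is used as the negative sample.
target = lift_to_hyperboloid([0.1, 0.0])
near = lift_to_hyperboloid([0.12, 0.0])
far = lift_to_hyperboloid([3.0, 0.0])
loss_good = sgns_loss(target, near, [far])
loss_bad = sgns_loss(target, far, [near])
```

In a full training loop the gradient step would also need to keep points on the hyperboloid (for example via Riemannian optimization); that part is omitted here.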
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Software Engineering Research
