Deconstructing word embedding algorithms
Kian Kenyon-Dean, Edward Newell, Jackie Chi Kit Cheung

TL;DR
This paper retrospectively analyzes popular word embedding algorithms such as Word2vec and GloVe, identifying shared conditions that seem to be required for good performance, with the aim of guiding future NLP model development.
Contribution
It deconstructs major word embedding algorithms into a unified framework, uncovering shared principles that underpin their effectiveness.
Findings
Identifies common conditions that seem to be required for performant word embeddings
Provides a unified theoretical framework for existing algorithms
Suggests directions for developing improved embeddings
Abstract
Word embeddings are reliable feature representations of words used to obtain high quality results for various NLP applications. Uncontextualized word embeddings are used in many NLP tasks today, especially in resource-limited settings where high memory capacity and GPUs are not available. Given the historical success of word embeddings in NLP, we propose a retrospective on some of the most well-known word embedding algorithms. In this work, we deconstruct Word2vec, GloVe, and others, into a common form, unveiling some of the common conditions that seem to be required for making performant word embeddings. We believe that the theoretical findings in this paper can provide a basis for more informed development of future models.
Taxonomy
Methods: GloVe Embeddings
