Closed-Form Training Dynamics Reveal Learned Features and Linear Structure in Word2Vec-like Models
Dhruva Karkada, James B. Simon, Yasaman Bahri, Michael R. DeWeese

TL;DR
This paper analytically solves the training dynamics of word2vec-like models, showing that they learn interpretable, orthogonal linear subspaces one at a time, with closed-form solutions expressed solely in terms of corpus statistics and training hyperparameters.
Contribution
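The paper derives analytical solutions for the gradient flow training dynamics and the final embeddings of a quartic Taylor approximation of the word2vec loss, expressed in terms of corpus statistics and training hyperparameters, and uses them to show that the embeddings are learned one linear subspace at a time.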
Findings
Models learn orthogonal linear subspaces incrementally.
When trained on Wikipedia, each of the top subspaces corresponds to an interpretable topic-level concept.
Linear semantic concepts emerge during training, enabling analogy completion.
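The incremental, one-subspace-at-a-time learning described in the findings above can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the paper's model: it uses a generic symmetric matrix factorization loss, 1/4 * ||M - W W^T||_F^2, as a stand-in, with a synthetic target matrix `M`, hypothetical eigenvalues, and plain gradient descent from a small random initialization in place of gradient flow. All names and parameter values are illustrative.

```python
import numpy as np

def simulate(eigvals, dim, steps=6000, lr=0.01, init_scale=1e-6, seed=0):
    """Gradient descent on 1/4 * ||M - W W^T||_F^2 from a tiny random init.

    Toy stand-in objective (an assumption, not the paper's loss).
    Returns the per-step top-`dim` eigenvalues of W W^T, the target M,
    its orthonormal eigenvectors Q, and the final W.
    """
    rng = np.random.default_rng(seed)
    n = len(eigvals)
    # Synthetic symmetric target with a known eigendecomposition.
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    M = Q @ np.diag(eigvals) @ Q.T
    W = init_scale * rng.standard_normal((n, dim))
    history = np.empty((steps, dim))
    for t in range(steps):
        # The negative gradient of the loss with respect to W is (M - W W^T) W.
        W = W + lr * (M - W @ W.T) @ W
        # eigvalsh returns ascending order; keep the `dim` largest, descending.
        history[t] = np.linalg.eigvalsh(W @ W.T)[::-1][:dim]
    return history, M, Q, W

# Hypothetical, well-separated spectrum; the model can hold only 4 directions.
eigvals = np.array([4.0, 2.0, 1.0, 0.5, 0.25, 0.1])
dim = 4
history, M, Q, W = simulate(eigvals, dim)

# First step at which the k-th learned eigenvalue passes half of its target.
arrival = np.array(
    [int(np.argmax(history[:, k] > 0.5 * eigvals[k])) for k in range(dim)]
)
# Best rank-`dim` approximation of M, which the learned W W^T should approach.
M_top = Q[:, :dim] @ np.diag(eigvals[:dim]) @ Q[:, :dim].T

print("arrival steps per direction:", arrival)
print("max abs error vs. rank-4 target:", np.abs(W @ W.T - M_top).max())
```

In this toy setting, directions with larger target eigenvalues are picked up earlier, so the effective rank of `W @ W.T` grows one step at a time until all `dim` available directions are in use.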
Abstract
Self-supervised word embedding algorithms such as word2vec provide a minimal setting for studying representation learning in language modeling. We examine the quartic Taylor approximation of the word2vec loss around the origin, and we show that both the resulting training dynamics and the final performance on downstream tasks are empirically very similar to those of word2vec. Our main contribution is to analytically solve for both the gradient flow training dynamics and the final word embeddings in terms of only the corpus statistics and training hyperparameters. The solutions reveal that these models learn orthogonal linear subspaces one at a time, each one incrementing the effective rank of the embeddings until model capacity is saturated. Training on Wikipedia, we find that each of the top linear subspaces represents an interpretable topic-level concept. Finally, we apply our theory…
Taxonomy
Topics: Language and cultural evolution
