Efficient Estimation of Word Representations in Vector Space
Tomas Mikolov, Kai Chen, Greg Corrado, Jeffrey Dean

TL;DR
This paper introduces two new models for efficiently learning high-quality word vectors from large datasets, achieving state-of-the-art results with significantly reduced computational resources.
Contribution
The paper presents two model architectures, Continuous Bag-of-Words (CBOW) and Skip-gram, that outperform previous neural-network-based methods in both accuracy and training cost for learning word representations.
Findings
Learns high-quality word vectors in less than a day from a 1.6-billion-word data set
Outperforms previous methods in word similarity tasks
Achieves state-of-the-art results on the authors' test set of syntactic and semantic word similarities
Abstract
We propose two novel model architectures for computing continuous vector representations of words from very large data sets. The quality of these representations is measured in a word similarity task, and the results are compared to the previously best performing techniques based on different types of neural networks. We observe large improvements in accuracy at much lower computational cost, i.e. it takes less than a day to learn high quality word vectors from a 1.6 billion words data set. Furthermore, we show that these vectors provide state-of-the-art performance on our test set for measuring syntactic and semantic word similarities.
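The word similarity evaluation mentioned in the abstract is commonly illustrated with analogy questions answered by vector arithmetic and cosine similarity. The following is a minimal sketch of that idea, not the authors' evaluation code: the three-dimensional vectors in `TOY_VECTORS` are invented for illustration, and the helper names `cosine` and `solve_analogy` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def solve_analogy(vectors, a, b, c):
    """Answer 'a is to b as c is to ?' by finding the word closest to
    vec(b) - vec(a) + vec(c), excluding the three question words."""
    target = [vb - va + vc
              for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(target, vectors[w]))

# Made-up toy vectors (assumption): real word vectors are learned from
# text and typically have hundreds of dimensions.
TOY_VECTORS = {
    "man":   [1.0, 0.0, 0.1],
    "woman": [1.0, 1.0, 0.1],
    "king":  [1.0, 0.0, 1.0],
    "queen": [1.0, 1.0, 1.0],
    "apple": [0.0, 0.2, -1.0],
}

answer = solve_analogy(TOY_VECTORS, "man", "woman", "king")
print(answer)
```

With these toy vectors the offset from "man" to "woman" applied to "king" lands exactly on "queen"; with learned vectors the match is only approximate, which is why the nearest neighbour by cosine similarity is taken as the answer.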
Videos
Attention in transformers, step-by-step | Deep Learning Chapter 6 (YouTube)
Transformers, the tech behind LLMs | Deep Learning Chapter 5 (YouTube)
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis
Methods: Skip-gram Word2Vec · Continuous Bag-of-Words Word2Vec
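
The two methods listed above differ in the direction of prediction: CBOW predicts the current word from its surrounding context, while Skip-gram predicts the surrounding words from the current word. The sketch below shows only how training examples could be extracted from a tokenized sentence under that description. It assumes a fixed symmetric window and whitespace tokenization, and the function names `cbow_examples` and `skipgram_examples` are hypothetical; it does not train a model.

```python
def cbow_examples(tokens, window=2):
    """Build (context_words, target_word) examples: context predicts center."""
    examples = []
    for i, center in enumerate(tokens):
        # Up to `window` words on each side of position i, excluding i itself.
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        if context:
            examples.append((context, center))
    return examples

def skipgram_examples(tokens, window=2):
    """Build (center_word, context_word) pairs: center predicts each neighbour."""
    pairs = []
    for i, center in enumerate(tokens):
        start = max(0, i - window)
        stop = min(len(tokens), i + window + 1)
        for j in range(start, stop):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

sentence = "the quick brown fox".split()
cbow = cbow_examples(sentence, window=1)
skipgram = skipgram_examples(sentence, window=1)

for context, target in cbow:
    print("CBOW:", context, "->", target)
for center, neighbour in skipgram:
    print("Skip-gram:", center, "->", neighbour)
```

Each CBOW example groups all neighbours into one input, whereas Skip-gram emits one pair per neighbour, so the same sentence yields more Skip-gram training pairs than CBOW examples.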
