Advances in Pre-Training Distributed Word Representations
Tomas Mikolov, Edouard Grave, Piotr Bojanowski, Christian Puhrsch, Armand Joulin

TL;DR
This paper releases a new set of pre-trained word vector models, trained with a combination of known techniques that are rarely used together, which outperform the previous state of the art by a large margin on a number of tasks.
Contribution
It shows how to train high-quality word vector representations by combining known techniques that are rarely used together, and makes the resulting pre-trained models publicly available.
Findings
New pre-trained models outperform the previous state of the art by a large margin
Combining multiple known techniques improves embedding quality
Models achieve superior results on diverse NLP tasks
Abstract
Many Natural Language Processing applications nowadays rely on pre-trained word representations estimated from large text corpora such as news collections, Wikipedia and Web Crawl. In this paper, we show how to train high-quality word vector representations by using a combination of known tricks that are however rarely used together. The main result of our work is the new set of publicly available pre-trained models that outperform the current state of the art by a large margin on a number of tasks.
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques
