Advances in Pre-Training Distributed Word Representations

Tomas Mikolov; Edouard Grave; Piotr Bojanowski; Christian Puhrsch; Armand Joulin

arXiv:1712.09405·cs.CL·December 29, 2017·330 cites

TL;DR

This paper introduces a new set of pre-trained word vector models, trained with a combination of known techniques that are rarely used together, which outperform the previous state of the art by a large margin on a number of tasks.

Contribution

The paper shows how to combine known methods for training high-quality word embeddings and publicly releases the resulting pre-trained models.

Findings

01 New pre-trained models outperform previous state-of-the-art

02 Combining multiple known techniques improves embedding quality

03 Models achieve superior results on diverse NLP tasks

Abstract

Many Natural Language Processing applications nowadays rely on pre-trained word representations estimated from large text corpora such as news collections, Wikipedia and Web Crawl. In this paper, we show how to train high-quality word vector representations by using a combination of known tricks that are, however, rarely used together. The main result of our work is the new set of publicly available pre-trained models that outperform the current state of the art by a large margin on a number of tasks.

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques