A Fast and Simple Algorithm for Training Neural Probabilistic Language Models

Andriy Mnih (University College London); Yee Whye Teh (University College London)

arXiv:1206.6426 · cs.CL · June 7, 2016 · ICML · 312 cites

TL;DR

This paper introduces a training algorithm based on noise-contrastive estimation that cuts the training time of neural probabilistic language models by more than an order of magnitude without affecting the quality of the resulting models.

Contribution

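The paper presents a fast and simple training algorithm for NPLMs based on noise-contrastive estimation. Because the models no longer need to be explicitly normalized over the whole vocabulary during training, training time drops sharply, and the algorithm is more efficient and much more stable than importance sampling.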

Findings

01 Training time reduced by over an order of magnitude

02 Achieved state-of-the-art results on sentence completion

03 More stable and efficient than importance sampling

Abstract

In spite of their superior performance, neural probabilistic language models (NPLMs) remain far less widely used than n-gram models due to their notoriously long training times, which are measured in weeks even for moderately-sized datasets. Training NPLMs is computationally expensive because they are explicitly normalized, which leads to having to consider all words in the vocabulary when computing the log-likelihood gradients. We propose a fast and simple algorithm for training NPLMs based on noise-contrastive estimation, a newly introduced procedure for estimating unnormalized continuous distributions. We investigate the behaviour of the algorithm on the Penn Treebank corpus and show that it reduces the training times by more than an order of magnitude without affecting the quality of the resulting models. The algorithm is also more efficient and much more stable than importance…
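
To make the cost argument in the abstract concrete, here is a minimal sketch of a per-example noise-contrastive objective. The abstract does not spell out the objective, so this assumes the standard NCE formulation: the model learns to tell the observed word apart from k words drawn from a known noise distribution, using its unnormalized score directly. The names `nce_objective`, `score`, `noise_prob`, and the toy vocabulary are hypothetical and chosen only for illustration.

```python
import math
import random


def log_sigmoid(x):
    # Numerically stable log(sigmoid(x)).
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def nce_objective(score, target, noise_words, noise_prob):
    """Per-example NCE objective (to be maximized).

    score(w)    -- unnormalized log-probability of word w in the current
                   context (assumption: used as if it were normalized)
    target      -- the observed next word
    noise_words -- k words sampled from the noise distribution
    noise_prob  -- dict mapping each word to its noise probability
    """
    k = len(noise_words)

    def delta(w):
        # Log-odds that w came from the data rather than from the noise.
        return score(w) - math.log(k * noise_prob[w])

    objective = log_sigmoid(delta(target))
    for w in noise_words:
        # log(1 - sigmoid(x)) equals log_sigmoid(-x).
        objective += log_sigmoid(-delta(w))
    return objective


# Toy setup (all values are made up for illustration).
vocab = ["the", "cat", "sat", "mat", "dog"]
noise_prob = {"the": 0.4, "cat": 0.15, "sat": 0.15, "mat": 0.15, "dog": 0.15}
toy_scores = {"the": -1.0, "cat": 0.5, "sat": -2.0, "mat": -0.5, "dog": -1.5}

calls = []


def score(w):
    calls.append(w)  # record how many words the model had to score
    return toy_scores[w]


rng = random.Random(0)
noise_words = rng.choices(vocab, weights=[noise_prob[w] for w in vocab], k=3)
value = nce_objective(score, "cat", noise_words, noise_prob)
print(value, len(calls))
```

In this sketch the model scores only k + 1 words per training example (here 4), regardless of vocabulary size, whereas an explicitly normalized log-likelihood would need a score for every word in the vocabulary.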

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis