Training Restricted Boltzmann Machines on Word Observations
George E. Dahl, Ryan P. Adams, Hugo Larochelle

TL;DR
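This paper introduces a training method for restricted Boltzmann machines (RBMs) on high-dimensional word data. It replaces exact sampling of the K-way softmax visible units with more general Markov chain Monte Carlo (MCMC) operators whose cost is independent of the vocabulary size K, which makes training on large vocabularies practical and improves performance on NLP tasks.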
Contribution
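The authors apply a more general class of MCMC operators to the visible units of an RBM trained on word observations, so that each update takes time independent of the vocabulary size K rather than linear in K. This permits larger vocabularies and yields learned features that improve results on NLP tasks.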
Findings
Successfully trained RBMs on hundreds of millions of word n-grams
Achieved state-of-the-art results in sentiment classification
Improved NLP task performance using learned features
Abstract
The restricted Boltzmann machine (RBM) is a flexible tool for modeling complex data; however, there have been significant computational difficulties in using RBMs to model high-dimensional multinomial observations. In natural language processing applications, words are naturally modeled by K-ary discrete distributions, where K is determined by the vocabulary size and can easily be in the hundreds of thousands. The conventional approach to training RBMs on word observations is limited because it requires sampling the states of K-way softmax visible units during block Gibbs updates, an operation that takes time linear in K. In this work, we address this issue by employing a more general class of Markov chain Monte Carlo operators on the visible units, yielding updates with computational complexity independent of K. We demonstrate the success of our approach by training RBMs on hundreds of…
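The complexity argument in the abstract can be illustrated with a small sketch. The abstract does not say which operators are used, so the code below assumes one possible choice: a Metropolis-Hastings step with a uniform proposal over the vocabulary. The names `score`, `mh_word_update`, `empirical_distribution`, and `exact_softmax` are hypothetical, and the weights `W`, biases `b`, and hidden states `h` are placeholders rather than anything taken from the paper. The point is only that an accept/reject step compares two unnormalized scores, so it never sums over all K words, whereas exact Gibbs sampling must normalize over the whole vocabulary.

```python
import math
import random


def score(k, h, W, b):
    """Unnormalized log-probability of word k given hidden states h.

    Costs O(len(h)); it does not depend on the vocabulary size K.
    """
    return b[k] + sum(W[k][j] * hj for j, hj in enumerate(h))


def mh_word_update(k, h, W, b, rng):
    """One Metropolis-Hastings step on a K-way softmax visible unit.

    Assumption for this sketch: the proposal is uniform over the K words,
    so the proposal terms cancel in the acceptance ratio.
    """
    K = len(b)
    proposal = rng.randrange(K)  # constant-time draw
    log_ratio = score(proposal, h, W, b) - score(k, h, W, b)
    if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
        return proposal  # accept the proposed word
    return k  # reject and keep the current word


def empirical_distribution(h, W, b, steps, seed=0):
    """Run the chain and return the visit frequency of each word."""
    rng = random.Random(seed)
    k = 0
    counts = [0] * len(b)
    for _ in range(steps):
        k = mh_word_update(k, h, W, b, rng)
        counts[k] += 1
    return [c / steps for c in counts]


def exact_softmax(h, W, b):
    """Reference conditional p(v = k | h); this is the O(K) computation."""
    scores = [score(k, h, W, b) for k in range(len(b))]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]
```

On a toy vocabulary, the visit frequencies from `empirical_distribution` should approach the output of `exact_softmax` as the number of steps grows. A uniform proposal is likely to mix slowly when K is large; a non-uniform proposal (for example, one based on word frequencies) would need its probabilities included in the acceptance ratio.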
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis
