Modeling Documents with Deep Boltzmann Machines
Nitish Srivastava, Ruslan R Salakhutdinov, Geoffrey E. Hinton

TL;DR
This paper presents a Deep Boltzmann Machine model for extracting latent semantic features from large unstructured document collections. Features from the model outperform LDA, Replicated Softmax, and DocNADE on document retrieval and classification.
Contribution
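The authors tie parameters across the layers of a Deep Boltzmann Machine. The tying enables an efficient pretraining algorithm and a state initialization scheme that aids inference, so the model trains as efficiently as a standard Restricted Boltzmann Machine.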
Findings
The model assigns better log probability to unseen data than the Replicated Softmax model.
Features outperform LDA, Replicated Softmax, and DocNADE in retrieval and classification.
Training is as efficient as for a standard Restricted Boltzmann Machine (RBM).
Abstract
We introduce a Deep Boltzmann Machine model suitable for modeling and extracting latent semantic representations from a large unstructured collection of documents. We overcome the apparent difficulty of training a DBM with judicious parameter tying. This parameter tying enables an efficient pretraining algorithm and a state initialization scheme that aids inference. The model can be trained just as efficiently as a standard Restricted Boltzmann Machine. Our experiments show that the model assigns better log probability to unseen data than the Replicated Softmax model. Features extracted from our model outperform LDA, Replicated Softmax, and DocNADE models on document retrieval and document classification tasks.
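As a rough illustration of what "features extracted from a model" means here, the sketch below computes hidden-unit activation probabilities for the Replicated Softmax baseline named in the abstract, not for the Deep Boltzmann Machine itself. It assumes the usual Replicated Softmax conditional, in which each hidden bias is scaled by the document length. The function name `hidden_probs` and the toy weights are hypothetical and are not taken from the paper.

```python
import math

def hidden_probs(counts, W, a):
    """P(h_j = 1 | v) for a Replicated Softmax RBM (illustrative sketch).

    counts: word-count vector of length K (vocabulary size)
    W:      list of F rows, each of length K (hidden-to-word weights)
    a:      list of F hidden biases
    """
    D = sum(counts)  # document length; it scales every hidden bias
    probs = []
    for w_row, a_j in zip(W, a):
        activation = D * a_j + sum(w * c for w, c in zip(w_row, counts))
        probs.append(1.0 / (1.0 + math.exp(-activation)))  # logistic sigmoid
    return probs

# Toy document over a 3-word vocabulary with 2 hidden units (made-up numbers).
counts = [2, 0, 1]
W = [[0.5, -0.2, 0.1], [0.0, 0.0, 0.0]]
a = [0.0, 0.0]
features = hidden_probs(counts, W, a)
```

The resulting `features` vector is the kind of fixed-length document representation that would then be fed to a retrieval or classification step.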
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Topic Modeling · Music and Audio Processing
Methods: Latent Dirichlet Allocation · Deep Boltzmann Machine · Softmax
