Modeling Documents with Deep Boltzmann Machines

Nitish Srivastava; Ruslan R Salakhutdinov; Geoffrey E. Hinton

arXiv:1309.6865 · cs.LG · September 27, 2013 · 60 cites

TL;DR

This paper presents a Deep Boltzmann Machine for extracting latent semantic features from large unstructured document collections. Its features outperform LDA, Replicated Softmax, and DocNADE on document retrieval and classification.

Contribution

The authors tie parameters in a Deep Boltzmann Machine, which enables an efficient pretraining algorithm and a state initialization scheme that aids inference, so the model trains as efficiently as a standard Restricted Boltzmann Machine.

Findings

01 The model assigns better log probability to unseen data than the Replicated Softmax model.

02 Features extracted from the model outperform LDA, Replicated Softmax, and DocNADE on document retrieval and document classification.

03 The model can be trained as efficiently as a standard Restricted Boltzmann Machine.

Abstract

We introduce a Deep Boltzmann Machine model suitable for modeling and extracting latent semantic representations from a large unstructured collection of documents. We overcome the apparent difficulty of training a DBM with judicious parameter tying. This parameter tying enables an efficient pretraining algorithm and a state initialization scheme that aids inference. The model can be trained just as efficiently as a standard Restricted Boltzmann Machine. Our experiments show that the model assigns better log probability to unseen data than the Replicated Softmax model. Features extracted from our model outperform LDA, Replicated Softmax, and DocNADE models on document retrieval and document classification tasks.
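For orientation, the Replicated Softmax baseline named in the abstract can be sketched as follows. This is a minimal illustration of that model's conditional distributions as they are commonly defined (a sigmoid over hidden units whose bias is scaled by document length, and a softmax over the vocabulary). It is not the paper's Deep Boltzmann Machine, and the function names, variable names, and toy numbers are assumptions made up for this example.

```python
import math

def hidden_probs(counts, W, a):
    """P(h_j = 1 | v) for a Replicated Softmax RBM (illustrative sketch).

    counts: word-count vector of length K (vocabulary size)
    W:      F x K weight matrix, given as a list of F rows
    a:      hidden biases, length F
    """
    D = sum(counts)  # document length scales the hidden bias
    probs = []
    for row, a_j in zip(W, a):
        activation = D * a_j + sum(w * c for w, c in zip(row, counts))
        probs.append(1.0 / (1.0 + math.exp(-activation)))
    return probs

def word_probs(h, W, b):
    """Softmax distribution over the vocabulary given hidden states h."""
    K = len(b)
    logits = [b[k] + sum(h[j] * W[j][k] for j in range(len(h)))
              for k in range(K)]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    Z = sum(exps)
    return [e / Z for e in exps]

# Toy example: 3-word vocabulary, 2 hidden units (all values are made up).
W = [[0.5, -0.2, 0.1], [-0.3, 0.4, 0.0]]
a = [0.0, 0.0]
b = [0.0, 0.0, 0.0]
counts = [2, 0, 1]

h = hidden_probs(counts, W, a)   # hidden activation probabilities
p = word_probs(h, W, b)          # distribution over the 3 words
```

The hidden probabilities `h` are the kind of feature vector that would be compared across documents in a retrieval or classification experiment. The mean-field value of `h` is passed to `word_probs` here only to keep the sketch short; a sampler would draw binary states instead.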

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Topic Modeling · Music and Audio Processing

Methods: Latent Dirichlet Allocation · Deep Boltzmann Machine · Softmax