Stochastic Collapsed Variational Bayesian Inference for Latent Dirichlet Allocation
James Foulds, Levi Boyles, Christopher Dubois, Padhraic Smyth, Max Welling

TL;DR
This paper introduces a stochastic collapsed variational Bayesian inference algorithm for LDA that is simpler, faster, and often yields better solutions than previous methods, enabling efficient large-scale and interactive topic modeling.
Contribution
It presents a novel stochastic collapsed variational Bayesian inference algorithm for LDA, improving efficiency and convergence over existing methods.
Findings
Faster convergence on large-scale corpora
Often finds better solutions than previous methods
Enables real-time topic modeling in interactive applications
Abstract
In the internet era there has been an explosion in the amount of digital text information available, leading to difficulties of scale for traditional inference algorithms for topic models. Recent advances in stochastic variational inference algorithms for latent Dirichlet allocation (LDA) have made it feasible to learn topic models on large-scale corpora, but these methods do not currently take full advantage of the collapsed representation of the model. We propose a stochastic algorithm for collapsed variational Bayesian inference for LDA, which is simpler and more efficient than the state of the art method. We show connections between collapsed variational Bayesian inference and MAP estimation for LDA, and leverage these connections to prove convergence properties of the proposed algorithm. In experiments on large-scale text corpora, the algorithm was found to converge faster and…
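The stochastic collapsed update the abstract refers to can be illustrated with a minimal sketch. It assumes a zeroth-order (CVB0-style) per-token update combined with decaying step sizes and minibatch averaging. The function name `scvb0_sketch`, the hyperparameter defaults, the step-size schedule, and the random initialisation are all illustrative assumptions, not details taken from this page.

```python
import numpy as np

def scvb0_sketch(docs, vocab_size, num_topics, alpha=0.1, eta=0.01,
                 batch_size=2, epochs=5, seed=0):
    """Toy stochastic CVB0-style updates for LDA (illustrative sketch only).

    docs: list of documents, each a non-empty list of integer word ids.
    Returns (topics, n_theta): topic-word distributions of shape
    (vocab_size, num_topics) and per-document expected topic counts.
    """
    rng = np.random.default_rng(seed)
    total_tokens = sum(len(d) for d in docs)
    # Expected-count statistics, randomly initialised (an assumption).
    n_phi = rng.random((vocab_size, num_topics)) * total_tokens / (vocab_size * num_topics)
    n_z = n_phi.sum(axis=0)
    n_theta = [rng.random(num_topics) * len(d) / num_topics for d in docs]

    step = 0
    for _ in range(epochs):
        order = rng.permutation(len(docs))
        for start in range(0, len(docs), batch_size):
            batch = order[start:start + batch_size]
            step += 1
            rho_phi = (step + 10.0) ** -0.9  # hypothetical decaying step size
            phi_hat = np.zeros_like(n_phi)
            batch_tokens = 0
            for j in batch:
                doc = docs[j]
                batch_tokens += len(doc)
                for t, w in enumerate(doc, start=1):
                    # Variational topic distribution for this token.
                    gamma = (n_phi[w] + eta) / (n_z + vocab_size * eta) * (n_theta[j] + alpha)
                    gamma /= gamma.sum()
                    # Online average of the document-topic counts
                    # (step counter restarts per visit, a simplification).
                    rho_theta = (t + 10.0) ** -0.9
                    n_theta[j] = (1 - rho_theta) * n_theta[j] + rho_theta * len(doc) * gamma
                    phi_hat[w] += gamma
            if batch_tokens == 0:
                continue
            # Scale the minibatch estimate up to the full corpus size.
            phi_hat *= total_tokens / batch_tokens
            n_phi = (1 - rho_phi) * n_phi + rho_phi * phi_hat
            n_z = (1 - rho_phi) * n_z + rho_phi * phi_hat.sum(axis=0)

    smoothed = n_phi + eta
    topics = smoothed / smoothed.sum(axis=0)
    return topics, n_theta
```

Calling `scvb0_sketch([[0, 1, 0, 1], [2, 3, 2, 3]], vocab_size=4, num_topics=2)` returns a matrix whose columns are normalised topic-word distributions. Because only expected counts are stored, rather than a variational distribution per token, memory use in this sketch does not grow with the number of tokens.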
Taxonomy
Topics: Topic Modeling · Bayesian Methods and Mixture Models · Natural Language Processing Techniques
Methods: Latent Dirichlet Allocation
