Blocking Collapsed Gibbs Sampler for Latent Dirichlet Allocation Models
Xin Zhang, Scott A. Sisson

TL;DR
This paper introduces a blocking scheme for the collapsed Gibbs sampler in LDA models, significantly improving mixing efficiency and reducing computation time for large numbers of topics.
Contribution
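The paper proposes a blocking scheme for the collapsed Gibbs sampler in LDA that, with a theoretical guarantee, improves chain mixing efficiency. It also develops two procedures for sampling the latent variables directly within each block: an O(K)-step backward simulation and an O(log K)-step nested simulation.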
Findings
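Substantial improvement in chain mixing efficiency over the single-site collapsed Gibbs sampler.
Significant reduction in computation time for models with many topics.
Latent variables within each block can be sampled directly, using either the O(K)-step backward simulation or the O(log K)-step nested simulation.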
Abstract
The latent Dirichlet allocation (LDA) model is a widely used latent variable model in machine learning for text analysis. Inference for this model typically involves a single-site collapsed Gibbs sampling step for latent variables associated with observations. The efficiency of the sampling is critical to the success of the model in practical large-scale applications. In this article, we introduce a blocking scheme to the collapsed Gibbs sampler for the LDA model which can, with a theoretical guarantee, improve chain mixing efficiency. We develop two procedures, an O(K)-step backward simulation and an O(log K)-step nested simulation, to directly sample the latent variables within each block. We demonstrate that the blocking scheme achieves substantial improvements in chain mixing compared to the state-of-the-art single-site collapsed Gibbs sampler. We also show that when the number of…
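For context, the single-site collapsed Gibbs sampler that the abstract uses as its baseline can be sketched as follows. This is a minimal illustration of the standard textbook sampler, not the paper's blocking scheme, whose details are not given on this page. The function name `collapsed_gibbs_lda`, the symmetric hyperparameters `alpha` and `beta`, and the toy corpus format (lists of integer word ids) are assumptions made for the example.

```python
import numpy as np

def collapsed_gibbs_lda(docs, n_topics, vocab_size,
                        alpha=0.1, beta=0.01, n_sweeps=50, seed=0):
    """Single-site collapsed Gibbs sampling for LDA (baseline sketch).

    docs: list of documents, each a list of integer word ids in [0, vocab_size).
    alpha, beta: symmetric Dirichlet hyperparameters (assumed values).
    """
    rng = np.random.default_rng(seed)
    n_dk = np.zeros((len(docs), n_topics))   # topic counts per document
    n_kw = np.zeros((n_topics, vocab_size))  # word counts per topic
    n_k = np.zeros(n_topics)                 # total tokens per topic

    # Random initial topic assignment for every token.
    z = []
    for d, doc in enumerate(docs):
        z_d = rng.integers(n_topics, size=len(doc))
        z.append(z_d)
        for w, k in zip(doc, z_d):
            n_dk[d, k] += 1
            n_kw[k, w] += 1
            n_k[k] += 1

    for _ in range(n_sweeps):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current token from the counts.
                k = z[d][i]
                n_dk[d, k] -= 1
                n_kw[k, w] -= 1
                n_k[k] -= 1

                # Full conditional over topics; this costs O(K) per token.
                p = (n_dk[d] + alpha) * (n_kw[:, w] + beta) \
                    / (n_k + vocab_size * beta)
                k = rng.choice(n_topics, p=p / p.sum())

                # Record the new assignment and restore the counts.
                z[d][i] = k
                n_dk[d, k] += 1
                n_kw[k, w] += 1
                n_k[k] += 1
    return z, n_dk, n_kw

# Toy usage: three short documents over a five-word vocabulary.
docs = [[0, 1, 2, 0], [3, 4, 3], [0, 3, 1, 4, 2]]
z, n_dk, n_kw = collapsed_gibbs_lda(docs, n_topics=2, vocab_size=5, n_sweeps=5)
```

Each update changes one latent variable at a time, conditional on all the others. The blocking scheme described in the abstract instead updates groups of latent variables jointly.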
Taxonomy
Topics: Bayesian Methods and Mixture Models · Markov Chains and Monte Carlo Methods · Gaussian Processes and Bayesian Inference
Methods: Latent Dirichlet Allocation
