Guaranteed inference in topic models
Khoat Than, Tung Doan

TL;DR
This paper introduces OPE (Online Maximum a Posteriori Estimation), a provably fast algorithm for posterior inference in topic models. Applied to learning LDA from large text streams, the resulting methods outperform existing ones.
Contribution
The paper presents OPE, a new inference algorithm with theoretical guarantees and fast convergence, applicable to various contexts and improving LDA learning from large datasets.
Findings
OPE converges quickly to a local stationary point.
OPE outperforms existing inference methods in experiments.
New LDA learning methods demonstrate superior performance.
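To make the first finding more concrete, here is a minimal sketch of a stochastic Frank-Wolfe-style update on the topic simplex. This summary does not spell out OPE's update rule, so everything below is an assumption for illustration: the MAP objective (log-likelihood plus a symmetric Dirichlet log-prior), the random choice between those two parts at each step, the `1 / (t + 1)` step size, and the names `ope_sketch`, `doc_counts`, `beta`, and `alpha`. It should not be read as the authors' implementation.

```python
import random


def ope_sketch(doc_counts, beta, alpha, n_iters=100, seed=0):
    """Hypothetical Frank-Wolfe-style MAP sketch for one document.

    doc_counts: dict mapping word id -> count (assumed input format)
    beta: list of K topics, each a list of word probabilities (> 0)
    alpha: symmetric Dirichlet hyperparameter (assumed)
    Returns a topic mixture theta on the probability simplex.
    """
    rng = random.Random(seed)
    num_topics = len(beta)
    theta = [1.0 / num_topics] * num_topics  # start inside the simplex
    picks = [0, 0]  # times the likelihood / prior part was sampled

    for t in range(1, n_iters + 1):
        # Sample one part of the objective uniformly at random.
        picks[rng.randrange(2)] += 1
        # Weights of the running stochastic approximation of the objective.
        w_lik = 2.0 * picks[0] / t
        w_pri = 2.0 * picks[1] / t

        # Gradient of the approximate objective at the current theta.
        grad = [w_pri * (alpha - 1.0) / theta[k] for k in range(num_topics)]
        for word, count in doc_counts.items():
            p_word = sum(theta[k] * beta[k][word] for k in range(num_topics))
            for k in range(num_topics):
                grad[k] += w_lik * count * beta[k][word] / p_word

        # Linear maximization over the simplex picks a single vertex.
        best = max(range(num_topics), key=lambda k: grad[k])

        # Move toward that vertex; this step size keeps theta strictly positive.
        step = 1.0 / (t + 1)
        theta = [(1.0 - step) * value for value in theta]
        theta[best] += step

    return theta
```

Calling `ope_sketch({0: 10}, [[0.9, 0.1], [0.1, 0.9]], alpha=0.5)` on a toy document containing only word 0 puts most of the mass on the first topic. Each iteration costs one pass over the document's distinct words, so the per-document cost grows linearly with the number of iterations.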
Abstract
One of the core problems in statistical models is the estimation of a posterior distribution. For topic models, the problem of posterior inference for individual texts is particularly important, especially when dealing with data streams, but is often intractable in the worst case. As a consequence, existing methods for posterior inference are approximate and have no guarantee on either quality or convergence rate. In this paper, we introduce a provably fast algorithm, namely Online Maximum a Posteriori Estimation (OPE), for posterior inference in topic models. OPE has more attractive properties than existing inference approaches, including theoretical guarantees on quality and a fast rate of convergence to a local maximal/stationary point of the inference problem. The discussion of OPE is very general, so the algorithm can be easily employed in a wide range of contexts. Finally, we…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Bayesian Methods and Mixture Models
