Guaranteed inference in topic models

Khoat Than; Tung Doan

arXiv:1512.03308 · stat.ML · August 18, 2016 · 5 cites

TL;DR

This paper introduces OPE, a provably fast and reliable algorithm for posterior inference in topic models, and applies it to learning LDA from large text streams, where it outperforms existing methods.

Contribution

The paper presents OPE, a new inference algorithm with theoretical guarantees and fast convergence that applies to a wide range of contexts and improves LDA learning from large datasets.

Findings

01. OPE converges quickly to a local stationary point.
02. OPE outperforms existing inference methods in experiments.
03. New LDA learning methods demonstrate superior performance.

Abstract

One of the core problems in statistical models is the estimation of a posterior distribution. For topic models, the problem of posterior inference for individual texts is particularly important, especially when dealing with data streams, but is often intractable in the worst case. As a consequence, existing methods for posterior inference are approximate and have no guarantee on either quality or convergence rate. In this paper, we introduce a provably fast algorithm, namely Online Maximum a Posteriori Estimation (OPE), for posterior inference in topic models. OPE has more attractive properties than existing inference approaches, including theoretical guarantees on quality and a fast rate of convergence to a local maximum/stationary point of the inference problem. The discussion of OPE is very general, and hence OPE can be easily employed in a wide range of contexts. Finally, we…

Code & Models

Repositories

Khoat/OPE (Official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Bayesian Methods and Mixture Models