Geometric Dirichlet Means algorithm for topic inference
Mikhail Yurochkin, XuanLong Nguyen

TL;DR
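This paper introduces a geometric algorithm for topic inference in LDA models that avoids the computational cost of Gibbs sampling and variational inference while achieving accuracy comparable to a Gibbs sampler. The topic estimates are shown to be statistically consistent under some conditions, and the method is evaluated on simulated and real data.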
Contribution
A geometric algorithm for topic inference, based on fast weighted clustering with geometric corrections, that overcomes the computational and statistical inefficiencies of Gibbs sampling and variational inference while achieving accuracy comparable to a Gibbs sampler.
Findings
Achieves comparable accuracy to Gibbs sampling
Overcomes computational and statistical inefficiencies of Gibbs sampling and variational inference
Proven statistical consistency under certain conditions
Abstract
We propose a geometric algorithm for topic learning and inference that is built on the convex geometry of topics arising from the Latent Dirichlet Allocation (LDA) model and its nonparametric extensions. To this end we study the optimization of a geometric loss function, which is a surrogate to the LDA's likelihood. Our method involves a fast optimization based weighted clustering procedure augmented with geometric corrections, which overcomes the computational and statistical inefficiencies encountered by other techniques based on Gibbs sampling and variational inference, while achieving the accuracy comparable to that of a Gibbs sampler. The topic estimates produced by our method are shown to be statistically consistent under some conditions. The algorithm is evaluated with extensive experiments on simulated and real data.
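The abstract describes the method as a weighted clustering step followed by geometric corrections. The following is a minimal Python sketch of that idea, not the authors' reference implementation. It assumes documents are given as a word-count matrix, that clustering weights are document lengths, and that the correction pushes each centroid away from the data center along a ray until it is as far out as the farthest document in its cluster. The names `weighted_kmeans` and `gdm_sketch`, the choice of center, and the extension rule are illustrative assumptions.

```python
import numpy as np

def weighted_kmeans(X, weights, K, n_iter=50, seed=0):
    """Plain Lloyd iterations where each point contributes with a weight."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=K, replace=False)].copy()
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for k in range(K):
            mask = labels == k
            if mask.any():
                centers[k] = np.average(X[mask], axis=0, weights=weights[mask])
    # Recompute labels so they match the final centers.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return centers, d2.argmin(axis=1)

def gdm_sketch(counts, K, seed=0):
    """Hypothetical sketch: weighted clustering plus a geometric correction."""
    counts = np.asarray(counts, dtype=float)
    lengths = counts.sum(axis=1)
    freqs = counts / lengths[:, None]      # documents as points on the simplex
    center = freqs.mean(axis=0)            # assumed choice of data center
    centroids, labels = weighted_kmeans(freqs, lengths, K, seed=seed)

    topics = np.empty_like(centroids)
    for k in range(K):
        direction = centroids[k] - center
        norm = np.linalg.norm(direction)
        members = freqs[labels == k]
        if norm == 0 or len(members) == 0:
            topics[k] = centroids[k]
            continue
        # Assumed extension rule: stretch the centroid along the ray from the
        # center until it is as far out as the farthest document in its cluster.
        m_k = np.linalg.norm(members - center, axis=1).max() / norm
        topics[k] = center + m_k * direction

    # Correction back onto the simplex: drop negatives, renormalize rows.
    topics = np.clip(topics, 0.0, None)
    return topics / topics.sum(axis=1, keepdims=True)

# Usage on synthetic LDA-style data (all sizes here are arbitrary).
rng = np.random.default_rng(1)
true_topics = rng.dirichlet(np.full(30, 0.1), size=3)
theta = rng.dirichlet(np.full(3, 0.5), size=200)
counts = np.vstack([rng.multinomial(100, t @ true_topics) for t in theta])
est = gdm_sketch(counts, K=3)
print(est.shape)
```

The intuition is that cluster centroids lie strictly inside the convex hull of the topics, so a purely clustering-based estimate is biased toward the center; the extension step is one simple way to move the estimates back toward the vertices.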
Taxonomy
Topics: Bayesian Methods and Mixture Models · Gaussian Processes and Bayesian Inference · Genetic and phenotypic traits in livestock
