A Practical Algorithm for Topic Modeling with Provable Guarantees
Sanjeev Arora, Rong Ge, Yoni Halpern, David Mimno, Ankur Moitra, David Sontag, Yichen Wu, Michael Zhu

TL;DR
This paper introduces a practical, efficient algorithm for topic model inference with provable guarantees. It produces results comparable to the best MCMC implementations while running orders of magnitude faster.
Contribution
The authors develop a new algorithm for topic inference that combines provable theoretical guarantees with practical efficiency and robustness.
Findings
Achieves comparable accuracy to MCMC methods
Runs orders of magnitude faster than the best MCMC implementations
Provides provable bounds on inference quality
Abstract
Topic models provide a useful method for dimensionality reduction and exploratory data analysis in large text corpora. Most approaches to topic model inference have been based on a maximum likelihood objective. Efficient algorithms exist that approximate this objective, but they have no provable guarantees. Recently, algorithms have been introduced that provide provable bounds, but these algorithms are not practical because they are inefficient and not robust to violations of model assumptions. In this paper we present an algorithm for topic model inference that is both provable and practical. The algorithm produces results comparable to the best MCMC implementations while running orders of magnitude faster.
Taxonomy
Topics: Topic Modeling · Advanced Graph Neural Networks · Text and Document Classification Technologies
