Scalable Inference for Latent Dirichlet Allocation
James Petterson, Tiberio Caetano

TL;DR
This paper presents a scalable, asynchronous distributed inference method for Latent Dirichlet Allocation that balances speed and accuracy, suitable for heterogeneous computing clusters.
Contribution
It introduces a simple, tunable approximation method for distributed LDA inference that is asynchronous and adaptable to different hardware environments.
Findings
Efficient distributed inference with adjustable accuracy
Asynchronous approach suitable for heterogeneous clusters
Demonstrates scalability and flexibility in LDA learning
Abstract
We investigate the problem of learning a topic model, the well-known Latent Dirichlet Allocation, in a distributed manner, using a cluster of C processors and dividing the corpus to be learned equally among them. We propose a simple approximate method that can be tuned, trading speed for accuracy according to the task at hand. Our approach is asynchronous, and therefore suitable for clusters of heterogeneous machines.
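The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general setting it describes: a corpus split equally among C workers, each running approximate inference against a possibly stale copy of the topic-word counts. It uses generic collapsed Gibbs sampling with periodic count merging, which is not necessarily the authors' method, and it simulates the workers sequentially with synchronization rounds rather than asynchronously. The names `split_corpus`, `gibbs_sweep`, `distributed_lda`, and the `sweeps_per_sync` knob (more local sweeps between merges means less communication but staler counts) are all hypothetical.

```python
import numpy as np

def split_corpus(docs, C):
    """Split documents into C shards of near-equal size (round-robin)."""
    return [docs[c::C] for c in range(C)]

def gibbs_sweep(docs, z, n_dk, n_kw, n_k, alpha, beta, rng):
    """One collapsed Gibbs sweep over a shard; counts are updated in place."""
    K, V = n_kw.shape
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            # Remove the token's current assignment from the counts.
            n_dk[d, k] -= 1
            n_kw[k, w] -= 1
            n_k[k] -= 1
            # Conditional distribution over topics for this token.
            p = (n_dk[d] + alpha) * (n_kw[:, w] + beta) / (n_k + V * beta)
            k = rng.choice(K, p=p / p.sum())
            z[d][i] = k
            n_dk[d, k] += 1
            n_kw[k, w] += 1
            n_k[k] += 1

def distributed_lda(docs, K, V, C=2, rounds=5, sweeps_per_sync=1,
                    alpha=0.1, beta=0.01, seed=0):
    """Simulate C workers sequentially (assumption: no real parallelism)."""
    rng = np.random.default_rng(seed)
    shards = split_corpus(docs, C)
    # Random initial topic assignments, stored per shard and per document.
    z = [[rng.integers(K, size=len(doc)) for doc in shard] for shard in shards]
    n_dk = [np.zeros((len(shard), K), dtype=np.int64) for shard in shards]
    global_kw = np.zeros((K, V), dtype=np.int64)
    for c, shard in enumerate(shards):
        for d, doc in enumerate(shard):
            for i, w in enumerate(doc):
                n_dk[c][d, z[c][d][i]] += 1
                global_kw[z[c][d][i], w] += 1
    for _ in range(rounds):
        deltas = []
        for c, shard in enumerate(shards):
            # Each worker samples against its own copy of the global counts,
            # which goes stale while the other workers make changes.
            local_kw = global_kw.copy()
            local_k = local_kw.sum(axis=1)
            for _ in range(sweeps_per_sync):
                gibbs_sweep(shard, z[c], n_dk[c], local_kw, local_k,
                            alpha, beta, rng)
            deltas.append(local_kw - global_kw)
        # Merge step: apply every worker's count changes to the global table.
        global_kw = global_kw + sum(deltas)
    return global_kw

# Toy corpus: each document is a list of word ids in a vocabulary of size 5.
toy_docs = [[0, 1, 2, 0], [3, 4, 3], [0, 0, 1], [4, 4, 2, 3]]
topic_word = distributed_lda(toy_docs, K=2, V=5, C=2)
```

Because each worker only moves its own tokens between topics, the merged table always accounts for every token exactly once; the approximation lies in sampling against counts that ignore the other workers' concurrent changes.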
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Bayesian Methods and Mixture Models
