Fast search for Dirichlet process mixture models
Hal Daumé III

TL;DR
This paper introduces efficient search algorithms for Dirichlet process mixture models that significantly reduce computational costs, enabling their application to large datasets and providing good initializations for MCMC methods.
Contribution
It proposes practical search-based inference methods for DP mixture models, offering a faster alternative to traditional MCMC and variational approaches, especially for large-scale data.
Findings
Search algorithms effectively find MAP assignments in DP models.
The methods enable application of DP models to very large datasets.
Search results can initialize MCMC for posterior sampling.
Abstract
Dirichlet process (DP) mixture models provide a flexible Bayesian framework for density estimation. Unfortunately, their flexibility comes at a cost: inference in DP mixture models is computationally expensive, even when conjugate distributions are used. In the common case when one seeks only a maximum a posteriori assignment of data points to clusters, we show that search algorithms provide a practical alternative to expensive MCMC and variational techniques. When a true posterior sample is desired, the solution found by search can serve as a good initializer for MCMC. Experimental results show that using these techniques it is possible to apply DP mixture models to very large data sets.
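To make the idea of searching for a MAP cluster assignment concrete, here is a minimal sketch. It is a greedy, one-pass simplification and not the paper's search procedure, which this summary does not detail. Everything in it is an assumption chosen for illustration: one-dimensional data, a Gaussian likelihood with known variance `sigma2`, a conjugate `N(0, tau2)` prior on each cluster mean, a concentration parameter `alpha`, and the hypothetical function names `log_predictive` and `greedy_map_assignment`.

```python
import math


def log_predictive(x, n, s, sigma2=1.0, tau2=10.0):
    """Log predictive density of x for a cluster holding n points with sum s.

    Assumes a Gaussian likelihood with known variance sigma2 and a
    conjugate N(0, tau2) prior on the cluster mean (illustrative choices).
    With n == 0 this is the prior predictive for a brand-new cluster.
    """
    precision = n / sigma2 + 1.0 / tau2        # posterior precision of the mean
    post_mean = (s / sigma2) / precision       # posterior mean of the cluster mean
    pred_var = 1.0 / precision + sigma2        # predictive variance for a new point
    return -0.5 * (math.log(2 * math.pi * pred_var)
                   + (x - post_mean) ** 2 / pred_var)


def greedy_map_assignment(data, alpha=1.0, sigma2=1.0, tau2=10.0):
    """Assign points one at a time to the locally highest-scoring cluster.

    Each candidate is scored by the log of (Chinese restaurant process
    weight) times (conjugate predictive density). The shared normalizer
    of the CRP weights is dropped because it does not affect the argmax.
    """
    counts, sums, labels = [], [], []
    for x in data:
        # Option 1: open a new cluster (CRP weight proportional to alpha).
        best_k = len(counts)
        best_score = math.log(alpha) + log_predictive(x, 0, 0.0, sigma2, tau2)
        # Option 2: join an existing cluster (CRP weight proportional to its size).
        for k, (n, s) in enumerate(zip(counts, sums)):
            score = math.log(n) + log_predictive(x, n, s, sigma2, tau2)
            if score > best_score:
                best_k, best_score = k, score
        if best_k == len(counts):
            counts.append(0)
            sums.append(0.0)
        counts[best_k] += 1
        sums[best_k] += x
        labels.append(best_k)
    return labels


labels = greedy_map_assignment([-5.1, -4.9, 5.0, 5.2, -5.0, 4.8])
print(labels)
```

A greedy pass commits to each decision and can miss the true MAP assignment. Keeping several partial assignments alive at each step, as in a beam search, is the natural extension. The resulting labels could also serve as the starting state for an MCMC sampler, in the spirit of the initialization use described in the abstract.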
Taxonomy
Topics: Bayesian Methods and Mixture Models · Algorithms and Data Compression
