A sequential algorithm for fast fitting of Dirichlet process mixture models
David Nott, Xiaole Zhang, Chris Yau, Ajay Jasra

TL;DR
This paper introduces VSUGS, a variational Bayes extension of the SUGS algorithm, enabling faster and more accurate Bayesian inference for large Dirichlet process mixture models, with demonstrated improvements in density estimation and classification.
Contribution
It proposes a novel variational SUGS algorithm that softens data allocation, improving speed and accuracy over the original SUGS method for large-scale mixture modeling.
Findings
VSUGS outperforms SUGS in density estimation.
VSUGS provides better classification results.
Application to flow cytometry and SNP data shows improved modeling.
Abstract
In this article we propose an improvement on the sequential updating and greedy search (SUGS) algorithm of Wang and Dunson for fast fitting of Dirichlet process mixture models. The SUGS algorithm provides a means for very fast approximate Bayesian inference for mixture data, which is particularly useful when data sets are so large that many standard Markov chain Monte Carlo (MCMC) algorithms cannot be applied efficiently, or take a prohibitively long time to converge. In particular, these ideas are used to initially interrogate the data, and to refine models such that one can potentially apply exact data analysis later on. SUGS relies upon sequentially allocating data to clusters and then updating the posterior on the subsequent allocations and parameters under the assumption that this allocation is correct. Our modification softens this approach, by providing a probability distribution…
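The sequential allocation idea described in the abstract can be illustrated with a small sketch. This is not the authors' SUGS or VSUGS implementation: it assumes a one-dimensional Gaussian mixture with known variance `sigma2`, a conjugate normal prior `N(mu0, tau2)` on each cluster mean, and a Dirichlet process concentration `alpha`. The function name `sequential_fit` and the `open_threshold` rule for opening a cluster in soft mode are hypothetical simplifications introduced only for illustration. With `soft=False` each point is committed to its most probable cluster (the greedy style of allocation); with `soft=True` each point contributes fractional weights to the clusters, which loosely mirrors the idea of softening the allocation.

```python
import math

def normal_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def sequential_fit(data, alpha=1.0, sigma2=1.0, mu0=0.0, tau2=10.0,
                   soft=False, open_threshold=0.5):
    """Single pass over data; returns per-cluster weights and weighted sums.

    Illustrative assumptions (not from the paper): known variance sigma2,
    N(mu0, tau2) prior on cluster means, and a hypothetical open_threshold
    that decides when soft mode opens a new cluster.
    """
    counts, sums = [], []  # (possibly fractional) sizes and weighted sums of x
    for i, x in enumerate(data):
        scores = []
        for n_k, s_k in zip(counts, sums):
            # Conjugate posterior for the cluster mean, then predictive density.
            prec = 1.0 / tau2 + n_k / sigma2
            mean = (mu0 / tau2 + s_k / sigma2) / prec
            scores.append(n_k / (i + alpha) * normal_pdf(x, mean, sigma2 + 1.0 / prec))
        # Last entry: probability mass for opening a new cluster.
        scores.append(alpha / (i + alpha) * normal_pdf(x, mu0, sigma2 + tau2))
        total = sum(scores)
        probs = [s / total for s in scores]

        if soft:
            weights = probs
            # Hypothetical rule: skip a weakly supported new cluster and
            # renormalise over the existing ones.
            if len(weights) > 1 and weights[-1] < open_threshold:
                rest = sum(weights[:-1])
                weights = [w / rest for w in weights[:-1]] + [0.0]
        else:
            # Greedy hard allocation: all weight on the most probable option.
            best = max(range(len(probs)), key=probs.__getitem__)
            weights = [1.0 if k == best else 0.0 for k in range(len(probs))]

        if weights[-1] > 0.0:
            counts.append(0.0)
            sums.append(0.0)
        for k in range(len(counts)):
            counts[k] += weights[k]
            sums[k] += weights[k] * x
    return counts, sums

if __name__ == "__main__":
    toy = [0.1, -0.2, 0.05, 8.0, 8.2, 7.9]  # made-up, well-separated toy data
    print(sequential_fit(toy))
    print(sequential_fit(toy, soft=True))
```

Because the pass is sequential, the result depends on the order of the data; the sketch makes no attempt to address that.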
