A simple example of Dirichlet process mixture inconsistency for the number of components
Jeffrey W. Miller, Matthew T. Harrison

TL;DR
This paper demonstrates that Dirichlet process mixtures can be inconsistent in estimating the true number of mixture components, even in simple cases, challenging their reliability for this purpose.
Contribution
It provides an elementary example showing severe inconsistency of DPMs in estimating the number of components, highlighting a fundamental limitation.
Findings
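In the example, the posterior probability that the data form exactly one cluster converges to 0 as the sample size grows, even though the true number of components is 1.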
DPMs can be inconsistent even with simple normal mixtures.
The inconsistency persists despite the simplicity of the setting.
Abstract
For data assumed to come from a finite mixture with an unknown number of components, it has become common to use Dirichlet process mixtures (DPMs) not only for density estimation, but also for inferences about the number of components. The typical approach is to use the posterior distribution on the number of components occurring so far --- that is, the posterior on the number of clusters in the observed data. However, it turns out that this posterior is not consistent --- it does not converge to the true number of components. In this note, we give an elementary demonstration of this inconsistency in what is perhaps the simplest possible setting: a DPM with normal components of unit variance, applied to data from a "mixture" with one standard normal component. Further, we find that this example exhibits severe inconsistency: instead of going to 1, the posterior probability that there is…
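The posterior on the number of clusters described in the abstract can be computed exactly for very small samples by enumerating every partition of the data. The following is a minimal sketch of that calculation, not the authors' code. It assumes a concentration parameter `alpha = 1` and a standard normal base measure on the component means, neither of which is stated in the abstract above; the function names are hypothetical. With only a handful of points it illustrates the quantity being studied and cannot by itself show the asymptotic behavior.

```python
import math
import random


def set_partitions(items):
    """Yield every partition of `items` as a list of blocks (lists)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in set_partitions(rest):
        # Either `first` starts a new block ...
        yield [[first]] + part
        # ... or it joins one of the existing blocks.
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]


def log_marginal(xs):
    """Log marginal likelihood of one cluster.

    Assumed model: x_i | theta ~ N(theta, 1), theta ~ N(0, 1).
    The N(0, 1) prior on theta is an assumption made for this sketch.
    """
    m = len(xs)
    s = sum(xs)
    ss = sum(x * x for x in xs)
    return (-0.5 * m * math.log(2 * math.pi)
            - 0.5 * math.log(m + 1)
            - 0.5 * ss
            + s * s / (2 * (m + 1)))


def cluster_count_posterior(data, alpha=1.0):
    """Exact posterior on the number of clusters, by brute-force enumeration.

    Each partition is weighted by alpha^t * prod((|block| - 1)!) from the
    Chinese restaurant process, times the per-block marginal likelihoods.
    Only feasible for small n (the number of partitions is a Bell number).
    """
    log_weights = {}
    for part in set_partitions(list(data)):
        t = len(part)
        lw = t * math.log(alpha)
        for block in part:
            # lgamma(k) = log((k - 1)!)
            lw += math.lgamma(len(block)) + log_marginal(block)
        log_weights.setdefault(t, []).append(lw)
    # Subtract the largest log weight before exponentiating, for stability.
    top = max(max(v) for v in log_weights.values())
    unnorm = {t: sum(math.exp(l - top) for l in v)
              for t, v in log_weights.items()}
    z = sum(unnorm.values())
    return {t: unnorm[t] / z for t in sorted(unnorm)}


if __name__ == "__main__":
    random.seed(0)
    # Data from a single standard normal component, as in the paper's setting.
    sample = [random.gauss(0.0, 1.0) for _ in range(8)]
    for t, p in cluster_count_posterior(sample).items():
        print(f"P(T = {t} | data) = {p:.4f}")
```

Enumeration is used here only because it is exact and easy to check; for realistic sample sizes one would need a sampler or the analytical argument given in the paper.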
Taxonomy
Topics: Bayesian Methods and Mixture Models
