Analysis of the maximal posterior partition in the Dirichlet Process Gaussian Mixture Model
Łukasz Rajkowski

TL;DR
This paper analyzes the properties of the MAP clustering in Gaussian Dirichlet process mixture models, revealing geometric and asymptotic characteristics that influence the estimated number of clusters.
Contribution
It provides theoretical insights into the structure and asymptotic behavior of the MAP partition in Gaussian DPMMs, including bounds on the number of clusters.
Findings
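The convex hulls of the clusters in the MAP partition are almost disjoint: any two share at most one point.
Under natural assumptions, clusters that intersect any fixed ball have sizes comparable to the number of observations, which bounds the number of clusters intersecting that region.
Asymptotically, the MAP partition maximizes a simple function that depends on the covariance.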
Abstract
Mixture models are a natural choice in many applications, but it can be difficult to place an a priori upper bound on the number of components. To circumvent this, investigators are turning increasingly to Dirichlet process mixture models (DPMMs). It is therefore important to develop an understanding of the strengths and weaknesses of this approach. This work considers the MAP (maximum a posteriori) clustering for the Gaussian DPMM (where the cluster means have a Gaussian distribution and, for each cluster, the observations within the cluster have a Gaussian distribution). Some desirable properties of the MAP partition are proved: 'almost disjointness' of the convex hulls of clusters (they may have at most one point in common) and (with natural assumptions) the comparability of sizes of those clusters that intersect any fixed ball with the number of observations (as the latter goes to…
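To make the model in the abstract concrete, here is a minimal sketch of drawing data from a Gaussian DPMM. It relies on the standard fact that a Dirichlet process mixture induces a Chinese restaurant process prior on partitions. The function names, the concentration parameter `alpha`, and the isotropic scales `tau` (cluster means) and `sigma` (within-cluster noise) are illustrative assumptions, not the paper's notation, and the isotropic covariances are a simplification. The sketch only simulates from the model; it does not compute the MAP partition.

```python
import numpy as np


def sample_crp_partition(n, alpha, rng):
    """Draw cluster labels for n items from a Chinese restaurant process."""
    labels = np.zeros(n, dtype=int)
    counts = [1]  # the first item always opens cluster 0
    for i in range(1, n):
        # Join existing cluster k with probability proportional to its size,
        # or open a new cluster with probability proportional to alpha.
        weights = np.array(counts + [alpha], dtype=float)
        k = rng.choice(len(weights), p=weights / weights.sum())
        if k == len(counts):
            counts.append(1)
        else:
            counts[k] += 1
        labels[i] = k
    return labels


def sample_gaussian_dpmm(n, alpha=1.0, dim=2, tau=3.0, sigma=1.0, seed=0):
    """Simulate n observations from a Gaussian DPMM (isotropic, by assumption)."""
    rng = np.random.default_rng(seed)
    labels = sample_crp_partition(n, alpha, rng)
    n_clusters = labels.max() + 1
    # Cluster means are Gaussian: N(0, tau^2 I).
    means = rng.normal(0.0, tau, size=(n_clusters, dim))
    # Observations are Gaussian around their cluster mean: N(mean, sigma^2 I).
    data = means[labels] + rng.normal(0.0, sigma, size=(n, dim))
    return data, labels, means


if __name__ == "__main__":
    data, labels, means = sample_gaussian_dpmm(200)
    print("clusters drawn:", means.shape[0])
    print("cluster sizes:", np.bincount(labels))
```

Running the script prints the number of clusters in one simulated partition and their sizes; varying `alpha` changes how readily new clusters are opened.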
Taxonomy
Topics: Bayesian Methods and Mixture Models · Statistical Methods and Bayesian Inference · Census and Population Estimation
