Identifiability of Nonparametric Mixture Models and Bayes Optimal Clustering
Bryon Aragam, Chen Dan, Eric P. Xing, Pradeep Ravikumar

TL;DR
This paper develops a general framework for the identifiability of nonparametric mixture models, enabling Bayes optimal clustering with theoretical guarantees and practical algorithms, applicable to complex real-world data.
Contribution
It introduces new identifiability conditions for nonparametric mixtures using overfitted parametric models, extending classical clustering theory to nonparametric settings.
Findings
Established conditions for nonparametric mixture identifiability.
Generalized Bayes optimal clustering to nonparametric models.
Provided a practical algorithm with consistency guarantees.
Abstract
Motivated by problems in data clustering, we establish general conditions under which families of nonparametric mixture models are identifiable, by introducing a novel framework involving clustering overfitted parametric (i.e., misspecified) mixture models. These identifiability conditions generalize existing conditions in the literature, and are flexible enough to include, for example, mixtures of Gaussian mixtures. In contrast to the recent literature on estimating nonparametric mixtures, we allow for general nonparametric mixture components, and instead impose regularity assumptions on the underlying mixing measure. As our primary application, we apply these results to partition-based clustering, generalizing the notion of a Bayes optimal partition from classical parametric model-based clustering to nonparametric settings. Furthermore, this framework is constructive so that it…
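The idea of clustering the components of an overfitted parametric mixture can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes one-dimensional Gaussian components, assumes the overfitted fit is already given (the weights, means, and standard deviations below are hypothetical numbers), and uses single linkage on the distance between component means as a stand-in for whatever component distance the paper uses. The helper names `single_linkage_groups` and `assign` are illustrative only.

```python
import math


def normal_pdf(x, mean, std):
    """Density of a univariate Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def single_linkage_groups(means, k):
    """Merge component indices into k groups by single linkage.

    Assumption: distance between two components is |mean_i - mean_j|.
    """
    groups = [[i] for i in range(len(means))]
    while len(groups) > k:
        best = None  # (distance, index of group a, index of group b)
        for a in range(len(groups)):
            for b in range(a + 1, len(groups)):
                d = min(abs(means[i] - means[j])
                        for i in groups[a] for j in groups[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        groups[a] = groups[a] + groups[b]
        del groups[b]
    return groups


def assign(x, weights, means, stds, groups):
    """Return the index of the group whose merged weighted density is largest at x."""
    scores = [
        sum(weights[i] * normal_pdf(x, means[i], stds[i]) for i in g)
        for g in groups
    ]
    return max(range(len(groups)), key=lambda g: scores[g])


# Hypothetical overfitted fit: 5 Gaussian components standing in for 2 clusters.
weights = [0.2, 0.2, 0.1, 0.25, 0.25]
means = [-3.0, -2.2, -1.6, 2.0, 3.1]
stds = [0.5, 0.5, 0.5, 0.6, 0.6]

groups = single_linkage_groups(means, k=2)
labels = [assign(x, weights, means, stds, groups) for x in (-2.5, 2.4)]
```

Each merged group plays the role of one nonparametric cluster: its density is the weighted sum of the Gaussian components it contains, and a point is assigned to the group with the largest such density, which mirrors the Bayes optimal partition in the parametric case.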
