Dirichlet Process Parsimonious Mixtures for clustering
Faicel Chamroukhi, Marius Bartcus, Hervé Glotin

TL;DR
This paper introduces Dirichlet Process Parsimonious Mixtures (DPPM), a Bayesian nonparametric approach that automatically determines the number of clusters and the structure of Gaussian mixtures, improving clustering flexibility and effectiveness.
Contribution
It develops a novel Bayesian nonparametric framework for parsimonious Gaussian mixture models, enabling automatic inference of the number of components and model structure from data.
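To illustrate why a Dirichlet process prior leaves the number of components open, here is a minimal sketch of the Chinese restaurant process, one standard representation of the partition such a prior induces. This summary does not state which representation the paper uses, so the function name `sample_crp`, the concentration value `alpha=1.0`, and the sample size are assumptions made for the example only.

```python
import numpy as np

def sample_crp(n, alpha, rng):
    """Draw a partition of n items from a Chinese restaurant process (illustrative)."""
    labels = np.zeros(n, dtype=int)
    counts = [1]  # the first item always opens the first cluster
    for i in range(1, n):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is opened with weight alpha.
        weights = np.array(counts + [alpha], dtype=float)
        k = rng.choice(len(weights), p=weights / weights.sum())
        if k == len(counts):
            counts.append(1)   # open a new cluster
        else:
            counts[k] += 1     # join an existing cluster
        labels[i] = k
    return labels, counts

rng = np.random.default_rng(0)
labels, counts = sample_crp(200, alpha=1.0, rng=rng)
n_clusters = len(counts)  # not fixed in advance; it depends on the draw
```

The number of occupied clusters is an outcome of the draw rather than an input, which is the property that lets a nonparametric mixture infer the number of components from data.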
Findings
DPPM models outperform standard parsimonious models in clustering tasks.
The Gibbs sampling approach effectively estimates model parameters.
Bayesian model selection via Bayes factors validates the models' adaptability.
Abstract
Parsimonious Gaussian mixture models, which exploit an eigenvalue decomposition of the group covariance matrices of the Gaussian mixture, have proved successful, particularly in cluster analysis. They are generally estimated by maximum likelihood and have also been considered from a parametric Bayesian perspective. We propose new Dirichlet Process Parsimonious Mixtures (DPPM), which represent a Bayesian nonparametric formulation of these parsimonious Gaussian mixture models. The proposed DPPM models are Bayesian nonparametric parsimonious mixture models that allow the model parameters, the optimal number of mixture components, and the optimal parsimonious mixture structure to be inferred simultaneously from the data. We develop a Gibbs sampling technique for maximum a posteriori (MAP) estimation of the developed DPPM models and provide a Bayesian model selection…
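The eigenvalue decomposition mentioned in the abstract is commonly written as Sigma_k = lambda_k D_k A_k D_k^T, where lambda_k is a scalar volume, D_k an orthogonal matrix of eigenvectors (orientation), and A_k a diagonal matrix with unit determinant (shape). The sketch below assumes this usual parameterization, which the truncated abstract does not spell out; the helper name `decompose_covariance` and the example matrix are hypothetical.

```python
import numpy as np

def decompose_covariance(sigma):
    """Split a symmetric positive-definite matrix into volume, orientation, shape."""
    d = sigma.shape[0]
    eigvals, eigvecs = np.linalg.eigh(sigma)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]          # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    lam = np.prod(eigvals) ** (1.0 / d)        # volume: |Sigma|^(1/d)
    A = np.diag(eigvals / lam)                 # shape: diagonal with det(A) = 1
    return lam, eigvecs, A

# Hypothetical 2x2 covariance used only to check the identity.
sigma = np.array([[4.0, 1.0], [1.0, 2.0]])
lam, D, A = decompose_covariance(sigma)
reconstructed = lam * D @ A @ D.T              # should recover sigma
```

Parsimonious variants arise by constraining these three factors, for example by sharing the volume, orientation, or shape across clusters, or by fixing the shape to the identity, which reduces the number of free covariance parameters.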
Taxonomy
Topics: Bayesian Methods and Mixture Models · Statistical Methods and Bayesian Inference · Data-Driven Disease Surveillance
