Model-based Clustering with Sparse Covariance Matrices
Michael Fop, Thomas Brendan Murphy, Luca Scrucca

TL;DR
This paper introduces a novel clustering method using mixtures of Gaussian covariance graph models that directly estimate sparse covariance matrices, allowing for flexible, parsimonious, and interpretable clustering of multivariate data.
Contribution
The paper develops a framework for model-based clustering that estimates sparse covariance matrices whose structure can differ from cluster to cluster, using penalized likelihood and a structural-EM algorithm.
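As a rough illustration of what a penalized likelihood over cluster-specific sparse covariance matrices could look like, the sketch below evaluates a Gaussian mixture log-likelihood and subtracts a penalty proportional to the number of nonzero off-diagonal covariance entries (edges of the covariance graph). The edge-count penalty, the function names, and the parameter `lam` are assumptions made for illustration. The paper describes a general penalty term on graph configurations, and this is neither the authors' implementation nor their structural-EM algorithm.

```python
import numpy as np

def gaussian_logpdf(X, mean, cov):
    """Log-density of N(mean, cov) evaluated at each row of X."""
    p = X.shape[1]
    diff = X - mean
    # Cholesky factor gives a stable solve and log-determinant.
    L = np.linalg.cholesky(cov)
    z = np.linalg.solve(L, diff.T)            # shape (p, n)
    maha = np.sum(z ** 2, axis=0)             # squared Mahalanobis distances
    logdet = 2.0 * np.sum(np.log(np.diag(L)))
    return -0.5 * (p * np.log(2 * np.pi) + logdet + maha)

def penalized_loglik(X, weights, means, covs, lam):
    """Mixture log-likelihood minus lam times the total number of edges.

    An edge is a nonzero off-diagonal entry of a cluster covariance
    matrix (hypothetical penalty choice, for illustration only).
    """
    log_comp = np.stack(
        [np.log(w) + gaussian_logpdf(X, m, S)
         for w, m, S in zip(weights, means, covs)],
        axis=1,
    )                                          # shape (n, K)
    # Log-sum-exp over mixture components for numerical stability.
    mx = log_comp.max(axis=1, keepdims=True)
    loglik = np.sum(mx[:, 0] + np.log(np.exp(log_comp - mx).sum(axis=1)))
    n_edges = sum(int(np.count_nonzero(np.triu(S, k=1))) for S in covs)
    return loglik - lam * n_edges

# Two clusters with different association structures (made-up values).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
covs = [
    np.array([[1.0, 0.4, 0.0], [0.4, 1.0, 0.0], [0.0, 0.0, 1.0]]),  # one edge
    np.eye(3),                                                      # no edges
]
means = [np.zeros(3), np.ones(3)]
print(penalized_loglik(X, [0.5, 0.5], means, covs, lam=1.0))
```

A zero off-diagonal entry in a cluster's covariance matrix corresponds to marginal independence between the two variables in that cluster, which is what makes the sparse structure directly interpretable.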
Findings
Achieves good classification performance on simulated and real data
Provides a flexible approach for modeling variable associations within clusters
Enables direct estimation of sparse covariance matrices for interpretability
Abstract
Finite Gaussian mixture models are widely used for model-based clustering of continuous data. Nevertheless, since the number of model parameters scales quadratically with the number of variables, these models can be easily over-parameterized. For this reason, parsimonious models have been developed via covariance matrix decompositions or assuming local independence. However, these remedies do not allow for direct estimation of sparse covariance matrices nor do they take into account that the structure of association among the variables can vary from one cluster to the other. To this end, we introduce mixtures of Gaussian covariance graph models for model-based clustering with sparse covariance matrices. A penalized likelihood approach is employed for estimation and a general penalty term on the graph configurations can be used to induce different levels of sparsity and incorporate prior…
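To make the quadratic growth in parameters concrete, the sketch below counts the free parameters of a Gaussian mixture with unrestricted covariance matrices and compares it with a mixture whose covariance matrices are sparse. The function names and the example edge counts are hypothetical. The counts use the standard formulas: K - 1 mixing proportions, K·p mean parameters, and either p(p + 1)/2 covariance parameters per cluster (unrestricted) or p variances plus one parameter per edge (sparse).

```python
def full_gmm_params(K, p):
    """Free parameters of a K-component Gaussian mixture in p dimensions
    with unrestricted covariance matrices."""
    return (K - 1) + K * p + K * p * (p + 1) // 2

def sparse_gmm_params(K, p, edges_per_cluster):
    """Free parameters when cluster k has p variances plus
    edges_per_cluster[k] nonzero off-diagonal covariances."""
    assert len(edges_per_cluster) == K
    return (K - 1) + K * p + sum(p + e for e in edges_per_cluster)

print(full_gmm_params(3, 10))                 # 197
print(full_gmm_params(3, 20))                 # 692
print(sparse_gmm_params(3, 10, [5, 0, 12]))   # 79
```

Doubling the number of variables from 10 to 20 more than triples the parameter count of the unrestricted model, while the sparse model grows only with the number of edges retained in each cluster.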
Taxonomy
Topics: Bayesian Methods and Mixture Models · Bayesian Modeling and Causal Inference · Advanced Clustering Algorithms Research
