Penalized model-based clustering with cluster-specific diagonal covariance matrices and grouped variables
Benhuai Xie, Wei Pan, Xiaotong Shen

TL;DR
This paper introduces a novel penalized model-based clustering method that allows for cluster-specific diagonal covariance matrices and grouped variable selection, improving clustering accuracy in high-dimensional, noisy data such as microarrays.
Contribution
It extends existing penalized likelihood clustering methods by incorporating cluster-specific covariance matrices and group variable selection, enhancing flexibility and practical applicability.
Findings
Performs effectively on high-dimensional microarray data
Improves variable selection accuracy with grouped penalties
Demonstrates its advantages in an application to leukemia subtype discovery
Abstract
Clustering analysis is one of the most widely used statistical tools in many emerging areas such as microarray data analysis. For microarray and other high-dimensional data, the presence of many noise variables may mask underlying clustering structures. Hence removing noise variables via variable selection is necessary. For simultaneous variable selection and parameter estimation, existing penalized likelihood approaches in model-based clustering analysis all assume a common diagonal covariance matrix across clusters, which however may not hold in practice. To analyze high-dimensional data, particularly those with relatively low sample sizes, this article introduces a novel approach that shrinks the variances together with means, in a more general situation with cluster-specific (diagonal) covariance matrices. Furthermore, selection of grouped variables via inclusion or exclusion of a…
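To make the idea of shrinking variances together with means more concrete, here is a minimal sketch of a penalized log-likelihood for a Gaussian mixture with cluster-specific diagonal covariance matrices. The truncated abstract does not state the exact penalty, so the L1 penalties on the means and on the log-variances are an illustrative assumption, as are the function names and the tuning parameters `lam_mean` and `lam_var`. The sketch also assumes the data have been standardized, so that a noise variable has mean 0 and variance 1 in every cluster.

```python
import numpy as np


def penalized_loglik(X, pi, mu, var, lam_mean, lam_var):
    """Penalized log-likelihood of a diagonal-covariance Gaussian mixture.

    X: (n, p) standardized data; pi: (K,) mixing proportions;
    mu: (K, p) cluster means; var: (K, p) cluster-specific variances.
    The L1 penalty form is an assumption made for illustration.
    """
    X = np.asarray(X, dtype=float)
    pi = np.asarray(pi, dtype=float)
    mu = np.asarray(mu, dtype=float)
    var = np.asarray(var, dtype=float)

    # Log density of each sample under each cluster, shape (n, K).
    diff = X[:, None, :] - mu[None, :, :]
    log_dens = -0.5 * np.sum(
        np.log(2.0 * np.pi * var)[None, :, :] + diff ** 2 / var[None, :, :],
        axis=2,
    )
    weighted = log_dens + np.log(pi)[None, :]

    # Numerically stable log-sum-exp over clusters.
    m = weighted.max(axis=1, keepdims=True)
    loglik = np.sum(m[:, 0] + np.log(np.exp(weighted - m).sum(axis=1)))

    # Assumed penalty: shrink means toward 0 and variances toward 1.
    penalty = lam_mean * np.abs(mu).sum() + lam_var * np.abs(np.log(var)).sum()
    return loglik - penalty


def noise_variables(mu, var, tol=1e-8):
    """Boolean mask of variables with mean 0 and variance 1 in all clusters."""
    mu = np.asarray(mu, dtype=float)
    var = np.asarray(var, dtype=float)
    return np.all(np.abs(mu) < tol, axis=0) & np.all(np.abs(var - 1.0) < tol, axis=0)
```

Under these assumptions, a variable whose means are all shrunk to 0 and whose variances are all shrunk to 1 contributes the same density in every cluster, so it carries no clustering information and can be treated as excluded.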
