Dimension reduction for model-based clustering
Luca Scrucca

TL;DR
This paper presents a dimension reduction method for visualizing the clustering structure estimated by a Gaussian mixture model. Projecting the data onto a few linear combinations of the original features gives summary plots that are particularly useful for high-dimensional data with noisy structure.
Contribution
It introduces a dimension reduction method that identifies linear combinations of the original features, ordered by their associated eigenvalues, which capture most of the clustering information in a fitted Gaussian mixture model and can be used to plot it.
Findings
Effective in high-dimensional settings
Produces clear visual summaries of clusters
Applicable to both simulated and real data
Abstract
We introduce a dimension reduction method for visualizing the clustering structure obtained from a finite mixture of Gaussian densities. Information on the dimension reduction subspace is obtained from the variation on group means and, depending on the estimated mixture model, on the variation on group covariances. The proposed method aims at reducing the dimensionality by identifying a set of linear combinations, ordered by importance as quantified by the associated eigenvalues, of the original features which capture most of the cluster structure contained in the data. Observations may then be projected onto such a reduced subspace, thus providing summary plots which help to visualize the clustering structure. These plots can be particularly appealing in the case of high-dimensional data and noisy structure. The new constructed variables capture most of the clustering information…
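The idea of eigenvalue-ordered linear combinations can be illustrated with a minimal sketch. It is a simplification, not the paper's estimator: it uses only the variation in group means (the covariance-variation term mentioned in the abstract is omitted), it takes hard cluster labels as given rather than fitting a mixture model, and it assumes the directions solve a generalized eigenproblem against the overall covariance matrix. The function name `mean_variation_directions` and the simulated data are hypothetical.

```python
import numpy as np

def mean_variation_directions(X, labels):
    """Directions ordered by how much between-cluster mean variation they carry.

    Simplified illustration: solves M v = l * Sigma v, where M is the
    weighted between-cluster scatter of the means and Sigma is the
    overall (maximum likelihood) covariance of the data.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    n, p = X.shape
    mu = X.mean(axis=0)
    Xc = X - mu
    sigma = Xc.T @ Xc / n  # overall covariance

    # Between-cluster variation of the means, weighted by cluster proportion.
    M = np.zeros((p, p))
    for k in np.unique(labels):
        members = X[labels == k]
        pi_k = members.shape[0] / n
        d = members.mean(axis=0) - mu
        M += pi_k * np.outer(d, d)

    # Reduce the generalized problem to a symmetric one via Cholesky whitening:
    # Sigma = L L^T, so L^{-1} M L^{-T} u = l u with v = L^{-T} u.
    L = np.linalg.cholesky(sigma)
    L_inv = np.linalg.inv(L)
    eigvals, U = np.linalg.eigh(L_inv @ M @ L_inv.T)

    order = np.argsort(eigvals)[::-1]  # most informative direction first
    return eigvals[order], L_inv.T @ U[:, order]

# Simulated example (assumed data): three clusters that differ only in the
# first two of four features.
rng = np.random.default_rng(0)
centers = np.array([[0, 0, 0, 0], [4, 0, 0, 0], [0, 4, 0, 0]], dtype=float)
labels = np.repeat(np.arange(3), 100)
X = centers[labels] + rng.normal(size=(300, 4))

eigvals, V = mean_variation_directions(X, labels)
Z = X @ V[:, :2]  # projection onto the two leading directions, ready to plot
```

With three clusters the between-means matrix has rank at most two, so only the first two eigenvalues are non-negligible and a two-dimensional scatter plot of `Z` retains all of the mean-separation information in this simplified setting.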
