Tandem clustering with invariant coordinate selection
Andreas Alfons, Aurore Archimbaud, Klaus Nordhausen, Anne Ruiz-Gazen

TL;DR
This paper introduces a tandem clustering method based on invariant coordinate selection (ICS). It aims to preserve cluster structure during dimension reduction better than PCA-based methods, particularly when the data contain outliers.
Contribution
The paper proposes a new ICS-based tandem clustering approach that addresses PCA limitations by jointly diagonalizing scatter matrices for better structure preservation.
Findings
ICS-based tandem clustering preserves cluster structure better than PCA-based tandem clustering.
Using specific scatter matrix pairs enhances clustering performance.
The method is effective even with outliers in the data.
Abstract
For multivariate data, tandem clustering is a well-known technique aiming to improve cluster identification through initial dimension reduction. Nevertheless, the usual approach using principal component analysis (PCA) has been criticized for focusing solely on inertia so that the first components do not necessarily retain the structure of interest for clustering. To address this limitation, a new tandem clustering approach based on invariant coordinate selection (ICS) is proposed. By jointly diagonalizing two scatter matrices, ICS is designed to find structure in the data while providing affine invariant components. Certain theoretical results have been previously derived and guarantee that under some elliptical mixture models, the group structure can be highlighted on a subset of the first and/or last components. However, ICS has garnered minimal attention within the context of…
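The joint diagonalization mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes one particular scatter pair, the sample covariance (COV) and the fourth-moment scatter (COV4), and the function names `cov4` and `ics` and the toy two-group data are hypothetical choices made only for this illustration. The sketch finds a matrix B with B S1 Bᵀ = I and B S2 Bᵀ diagonal, with the generalized eigenvalues sorted in decreasing order.

```python
import numpy as np

def cov4(X):
    """Fourth-moment scatter: covariance weighted by squared Mahalanobis distances."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    S1 = np.cov(X, rowvar=False)
    # Squared Mahalanobis distance of each observation with respect to COV.
    r2 = np.einsum("ij,jk,ik->i", Xc, np.linalg.inv(S1), Xc)
    return (Xc * r2[:, None]).T @ Xc / (n * (p + 2))

def ics(X):
    """Jointly diagonalize COV and COV4 (an assumed scatter pair).

    Returns the invariant coordinates Z, the generalized eigenvalues rho
    (decreasing), and the transformation matrix B.
    """
    Xc = X - X.mean(axis=0)
    S1 = np.cov(X, rowvar=False)
    S2 = cov4(X)
    # Whiten with respect to S1 using its symmetric inverse square root.
    w, V = np.linalg.eigh(S1)
    S1_inv_sqrt = V @ np.diag(w ** -0.5) @ V.T
    # Eigendecomposition of the whitened second scatter.
    M = S1_inv_sqrt @ S2 @ S1_inv_sqrt
    rho, U = np.linalg.eigh(M)
    order = np.argsort(rho)[::-1]
    rho, U = rho[order], U[:, order]
    B = U.T @ S1_inv_sqrt
    return Xc @ B.T, rho, B

# Toy data (hypothetical): two groups separated along the first variable,
# then mixed by an arbitrary linear map.
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 4))
X[:250, 0] += 6.0
X = X @ rng.standard_normal((4, 4))

Z, rho, B = ics(X)
```

In a tandem approach, a clustering algorithm would then be applied to a selected subset of the columns of `Z`, typically some of the first and/or last ones, rather than to the leading principal components. How many components to keep and which scatter pair to use are choices this sketch does not address.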
Taxonomy
Topics: Bayesian Methods and Mixture Models · Advanced Clustering Algorithms Research · Data-Driven Disease Surveillance
