Minimax-Optimal Spectral Clustering with Covariance Projection for High-Dimensional Anisotropic Mixtures
Chengzhu Huang, Yuqi Gu

TL;DR
This paper introduces COPO, a spectral clustering method tailored for high-dimensional anisotropic mixtures, achieving minimax-optimal misclustering rates and demonstrating superior empirical performance across diverse noise structures.
Contribution
The paper develops COPO, a novel covariance-aware spectral clustering algorithm with theoretical guarantees and minimax lower bounds for high-dimensional anisotropic mixture models.
Findings
COPO attains minimax-optimal misclustering rates in Gaussian settings.
The method outperforms existing clustering techniques in simulations.
Theoretical analysis confirms tight performance guarantees.
Abstract
In mixture models, anisotropic noise within each cluster is widely present in real-world data. This work investigates both computationally efficient procedures and fundamental statistical limits for clustering in high-dimensional anisotropic mixtures. We propose a new clustering method, Covariance Projected Spectral Clustering (COPO), which adapts to a wide range of dependent noise structures. We first project the data onto a low-dimensional space via eigen-decomposition of a diagonal-deleted Gram matrix. Our central methodological idea is to sharpen clustering in this embedding space by a covariance-aware reassignment step, using quadratic distances induced by estimated projected covariances. Through a novel row-wise analysis of the subspace estimation step in weak-signal regimes, which is of independent interest, we establish tight performance guarantees and algorithmic upper bounds…
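The two-stage procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `copo_sketch`, the square-root eigenvalue scaling of the embedding, the farthest-point seeding with Lloyd iterations, the small `ridge` term, and the use of a plain Mahalanobis-type quadratic distance are all assumptions made for illustration. It also assumes `k >= 2`.

```python
import numpy as np

def copo_sketch(X, k, n_iter=20, ridge=1e-8):
    """Illustrative two-stage clustering. X is (n, p); returns labels of shape (n,)."""
    n = X.shape[0]

    # Stage 1: spectral embedding from the diagonal-deleted Gram matrix.
    G = X @ X.T
    np.fill_diagonal(G, 0.0)
    vals, vecs = np.linalg.eigh(G)          # eigenvalues in ascending order
    vals, vecs = vals[-k:], vecs[:, -k:]    # keep the k largest
    # Assumption: scale eigenvectors by sqrt(eigenvalue) to get an (n, k) embedding.
    Y = vecs * np.sqrt(np.maximum(vals, 0.0))

    # Initial labels (assumption): farthest-point seeding, then Lloyd iterations.
    centers = [Y[0]]
    for _ in range(1, k):
        d = np.min([np.sum((Y - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(Y[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        d = ((Y[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = Y[labels == j].mean(axis=0)

    # Stage 2: covariance-aware reassignment in the embedding space, using
    # quadratic distances induced by per-cluster estimated covariances.
    for _ in range(n_iter):
        dist = np.full((n, k), np.inf)
        for j in range(k):
            pts = Y[labels == j]
            if len(pts) <= k:               # too few points to estimate a covariance
                continue
            mu = pts.mean(axis=0)
            S = np.atleast_2d(np.cov(pts, rowvar=False)) + ridge * np.eye(k)
            diff = Y - mu
            dist[:, j] = np.einsum("ij,ij->i", diff @ np.linalg.inv(S), diff)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels
```

Calling `copo_sketch(X, k=2)` on an `(n, p)` array returns one integer label per row. Deleting the diagonal of the Gram matrix removes the entries dominated by each sample's own noise energy. The second stage lets elongated clusters in the embedding be separated by their shape and not only by Euclidean distance to a center.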
Taxonomy
Topics: Bayesian Methods and Mixture Models
Methods: Spectral Clustering
