Subspace clustering in high-dimensions: Phase transitions & Statistical-to-Computational gap
Luca Pesce, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová

TL;DR
This paper characterizes the fundamental limits and algorithmic performance for high-dimensional subspace clustering with sparse cluster means, revealing a gap between what is statistically possible and what polynomial algorithms can achieve.
Contribution
It provides an exact asymptotic characterization of the statistically optimal reconstruction error, identifies the threshold below which recovery is statistically impossible, and explores the gap between information-theoretic limits and polynomial-time algorithms in high-dimensional sparse clustering.
Findings
Identified the information-theoretic threshold for cluster recovery.
Analyzed the performance of approximate message passing (AMP) through its state evolution and identified a statistical-to-computational gap.
Compared AMP with other algorithms, such as sparse PCA, in different sparsity regimes.
Abstract
A simple model to study subspace clustering is the high-dimensional k-Gaussian mixture model where the cluster means are sparse vectors. Here we provide an exact asymptotic characterization of the statistically optimal reconstruction error in this model in the high-dimensional regime with extensive sparsity, i.e. when the fraction ρ of non-zero components of the cluster means, as well as the ratio α between the number of samples and the dimension, are fixed while the dimension diverges. We identify the information-theoretic threshold below which obtaining a positive correlation with the true cluster means is statistically impossible. Additionally, we investigate the performance of the approximate message passing (AMP) algorithm analyzed via its state evolution, which is conjectured to be optimal among polynomial algorithms for this task. We identify in particular the…
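To make the setting concrete, the following is a minimal sketch of how data from a sparse Gaussian mixture could be sampled. It is an illustration under stated assumptions, not the authors' code: the function name `sample_sparse_mixture`, the restriction to two balanced clusters with means ±v, the Gauss-Bernoulli prior on the mean, and the `sqrt(snr / d)` signal scaling are all hypothetical choices that do not come from the text above. Only the roles of the sparsity fraction `rho` and the sample-to-dimension ratio `alpha` follow the abstract.

```python
import math
import random


def sample_sparse_mixture(d, alpha, rho, snr, seed=0):
    """Sample n = alpha * d points from a two-cluster Gaussian mixture.

    Assumptions (illustrative, not from the source): two balanced clusters
    with means +v and -v, a Gauss-Bernoulli sparse mean v, unit-variance
    noise, and a sqrt(snr / d) scaling of the signal.
    """
    rng = random.Random(seed)
    n = int(alpha * d)
    # Sparse mean: each coordinate is non-zero with probability rho.
    v = [rng.gauss(0.0, 1.0) if rng.random() < rho else 0.0 for _ in range(d)]
    # Hidden cluster labels, +1 or -1 with equal probability.
    labels = [rng.choice((-1, 1)) for _ in range(n)]
    scale = math.sqrt(snr / d)
    # Each sample is its (scaled) cluster mean plus standard Gaussian noise.
    X = [[scale * u * vi + rng.gauss(0.0, 1.0) for vi in v] for u in labels]
    return X, labels, v


X, labels, v = sample_sparse_mixture(d=200, alpha=0.5, rho=0.1, snr=4.0)
nonzero_fraction = sum(1 for vi in v if vi != 0.0) / len(v)
```

Here `len(X) / len(X[0])` equals `alpha`, and `nonzero_fraction` fluctuates around `rho`; both stay fixed as `d` grows, which is the regime the abstract describes.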
Taxonomy
Topics: Bayesian Methods and Mixture Models · Statistical Methods and Inference · Statistical Methods and Bayesian Inference
Methods: Approximate Message Passing
