High-dimensional analysis of semidefinite relaxations for sparse principal components
Arash A. Amini, Martin J. Wainwright

TL;DR
This paper investigates the high-dimensional sparse PCA problem, analyzing the performance of simple thresholding and semidefinite programming methods, and establishing fundamental limits on support recovery based on sample size and sparsity.
Contribution
It introduces a high-dimensional analysis of semidefinite relaxations for sparse PCA, providing thresholds for success and showing computational-statistical trade-offs.
Findings
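SDP relaxation recovers the support once the sample size n exceeds a threshold of order k log(p − k), provided the SDP has a rank-one solution
Diagonal thresholding needs a larger sample size, of order k² log(p − k), to succeed
No method, regardless of computational cost, can recover the support when n falls below an information-theoretic limit of order k log(p − k)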
Abstract
Principal component analysis (PCA) is a classical method for dimensionality reduction based on extracting the dominant eigenvectors of the sample covariance matrix. However, PCA is well known to behave poorly in the "large $p$, small $n$" setting, in which the problem dimension $p$ is comparable to or larger than the sample size $n$. This paper studies PCA in this high-dimensional regime, but under the additional assumption that the maximal eigenvector is sparse, say, with at most $k$ nonzero components. We consider a spiked covariance model in which a base matrix is perturbed by adding a $k$-sparse maximal eigenvector, and we analyze two computationally tractable methods for recovering the support set of this maximal eigenvector, as follows: (a) a simple diagonal thresholding method, which transitions from success to failure as a function of the rescaled sample size…
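To make the diagonal thresholding idea concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' code: the base matrix is taken to be the identity, the sparse eigenvector has equal-magnitude entries on its support, and the names `sample_spiked`, `diagonal_thresholding`, and the signal strength `beta` are hypothetical choices made for this example. The method estimates the per-coordinate variances (the diagonal of the sample covariance) and keeps the k largest.

```python
import math
import random


def sample_spiked(n, p, support, beta, rng):
    """Draw n zero-mean samples with covariance I + beta * z z^T.

    Assumption for this sketch: z has entries 1/sqrt(k) on `support`
    and 0 elsewhere, and the base matrix is the identity.
    """
    k = len(support)
    z = [0.0] * p
    for j in support:
        z[j] = 1.0 / math.sqrt(k)
    samples = []
    for _ in range(n):
        g = rng.gauss(0.0, 1.0)  # scalar factor along the spike direction
        row = [math.sqrt(beta) * g * z[j] + rng.gauss(0.0, 1.0) for j in range(p)]
        samples.append(row)
    return samples


def diagonal_thresholding(samples, k):
    """Return the sorted indices of the k largest sample variances."""
    n = len(samples)
    p = len(samples[0])
    # Diagonal of the (uncentered) sample covariance; the model is zero-mean.
    diag = [sum(x[j] ** 2 for x in samples) / n for j in range(p)]
    top = sorted(range(p), key=lambda j: diag[j], reverse=True)[:k]
    return sorted(top)


if __name__ == "__main__":
    rng = random.Random(0)
    true_support = [3, 17, 42, 58, 91]
    data = sample_spiked(n=1000, p=100, support=true_support, beta=3.0, rng=rng)
    print("true support:     ", true_support)
    print("estimated support:", diagonal_thresholding(data, k=len(true_support)))
```

In this setup the on-support variances are 1 + beta/k and the off-support variances are 1, so the gap shrinks as k grows for fixed beta. That is one way to see why this simple method needs more samples than the SDP relaxation when the support is large.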
