Minimax bounds for sparse PCA with noisy high-dimensional data
Aharon Birnbaum, Iain M. Johnstone, Boaz Nadler, Debashis Paul

TL;DR
This paper establishes fundamental limits on estimating sparse leading eigenvectors in high-dimensional noisy data and introduces a new two-stage coordinate selection method for improved estimation.
Contribution
It provides the first minimax risk lower bounds for sparse PCA with noisy high-dimensional data and proposes a novel estimation approach.
Findings
Lower bounds reveal different sparsity regimes affecting estimation difficulty.
The proposed method outperforms existing techniques in certain regimes.
Results highlight the importance of sparsity structure in high-dimensional PCA.
Abstract
We study the problem of estimating the leading eigenvectors of a high-dimensional population covariance matrix based on independent Gaussian observations. We establish a lower bound on the minimax risk of estimators under the ℓ2 loss, in the joint limit as dimension and sample size increase to infinity, under various models of sparsity for the population eigenvectors. The lower bound on the risk points to the existence of different regimes of sparsity of the eigenvectors. We also propose a new method for estimating the eigenvectors by a two-stage coordinate selection scheme.
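To make the idea of a two-stage coordinate selection scheme concrete, here is a minimal Python sketch. It is not the authors' procedure: the abstract does not describe the stages, so the variance screen in stage 1, the covariance screen in stage 2, the function name two_stage_leading_eigvec, and the thresholds var_thresh and cov_thresh are all assumptions chosen for illustration.

```python
import numpy as np


def two_stage_leading_eigvec(X, var_thresh, cov_thresh):
    """Illustrative two-stage coordinate selection (hypothetical, not the paper's method).

    X is an (n, p) array of observations assumed to have mean zero.
    Returns a length-p estimate of the leading eigenvector and the selected indices.
    """
    n, p = X.shape
    S = X.T @ X / n  # sample covariance matrix

    # Stage 1 (assumed rule): keep coordinates with large sample variance.
    stage1 = np.flatnonzero(np.diag(S) > var_thresh)
    if stage1.size == 0:
        return np.zeros(p), stage1

    # Stage 2 (assumed rule): among the remaining coordinates, keep those whose
    # covariances with the stage-1 coordinates have a large Euclidean norm.
    rest = np.setdiff1d(np.arange(p), stage1)
    scores = np.linalg.norm(S[np.ix_(rest, stage1)], axis=1)
    stage2 = rest[scores > cov_thresh]

    # PCA restricted to the selected coordinates; all other entries stay zero.
    selected = np.union1d(stage1, stage2)
    _, vecs = np.linalg.eigh(S[np.ix_(selected, selected)])  # ascending eigenvalues
    v = np.zeros(p)
    v[selected] = vecs[:, -1]
    return v, selected


# Synthetic single-spike example: four strong coordinates and one weak one.
rng = np.random.default_rng(0)
n, p, lam = 500, 200, 20.0
theta = np.zeros(p)
theta[:4] = np.sqrt(0.24)
theta[4] = 0.2  # too weak for the variance screen, caught by the covariance screen
X = np.sqrt(lam) * np.outer(rng.standard_normal(n), theta) + rng.standard_normal((n, p))

v_hat, selected = two_stage_leading_eigvec(X, var_thresh=3.0, cov_thresh=2.0)
print("selected coordinates:", selected)
print("|<v_hat, theta>| =", abs(v_hat @ theta))
```

In this toy setup the fifth coordinate has population variance 1.8, below the stage-1 threshold, while its covariance with the strong coordinates is large. That is the kind of coordinate a second selection stage is meant to recover. The thresholds here are hand-picked for the synthetic data, not data-driven choices from the paper.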
Taxonomy
Topics: Random Matrices and Applications · Statistical Methods and Inference · Bayesian Methods and Mixture Models
