Optimality and Sub-optimality of PCA I: Spiked Random Matrix Models
Amelia Perry, Alexander S. Wein, Afonso S. Bandeira, Ankur Moitra

TL;DR
This paper investigates the limits of principal component analysis (PCA) in detecting spikes in random matrix models, revealing conditions where PCA is optimal or sub-optimal, and proposing improved detection methods.
Contribution
It characterizes the statistical detection thresholds in several spiked random matrix models, compares them with the threshold at which PCA succeeds, and introduces a pre-transformation of the matrix that improves detection in non-Gaussian cases.
Findings
PCA achieves optimal detection in Gaussian Wigner models for certain priors.
PCA is sub-optimal in non-Gaussian Wigner models, but a pre-transformed PCA achieves optimal detection.
In Gaussian Wishart models, PCA is optimal for positive spikes but not always for negative spikes.
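The Gaussian Wigner finding can be illustrated numerically. The sketch below is not code from the paper; it assumes the common normalization Y = lam * x x^T + W / sqrt(n), with W a symmetric Gaussian matrix and x a unit-norm spike drawn from a Rademacher prior. The function names (`spiked_wigner`, `top_eig`) and the prior are illustrative assumptions. Under this normalization the spectral threshold is lam = 1: above it, the top eigenvalue is expected to separate from the bulk edge at 2 and approach lam + 1/lam, and the top eigenvector correlates with x.

```python
import numpy as np

def spiked_wigner(n, lam, rng):
    """Sample Y = lam * x x^T + W / sqrt(n) (assumed normalization).

    x is a unit-norm vector with i.i.d. +-1/sqrt(n) entries (assumed prior);
    W is symmetric Gaussian with off-diagonal variance 1 and diagonal variance 2.
    """
    x = rng.choice([-1.0, 1.0], size=n) / np.sqrt(n)
    g = rng.standard_normal((n, n))
    w = (g + g.T) / np.sqrt(2.0)
    return lam * np.outer(x, x) + w / np.sqrt(n), x

def top_eig(y):
    """Return the largest eigenvalue of a symmetric matrix and its eigenvector."""
    vals, vecs = np.linalg.eigh(y)  # eigenvalues come back in ascending order
    return vals[-1], vecs[:, -1]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for lam in (0.5, 2.0):
        y, x = spiked_wigner(600, lam, rng)
        val, vec = top_eig(y)
        overlap = np.dot(vec, x) ** 2  # squared correlation with the planted spike
        print(f"lam={lam}: top eigenvalue={val:.3f}, squared overlap={overlap:.3f}")
```

Below the threshold the top eigenvalue should sit near the bulk edge and the squared overlap should be close to zero, which is the sense in which PCA carries no information about the spike there.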
Abstract
A central problem of random matrix theory is to understand the eigenvalues of spiked random matrix models, introduced by Johnstone, in which a prominent eigenvector (or "spike") is planted into a random matrix. These distributions form natural statistical models for principal component analysis (PCA) problems throughout the sciences. Baik, Ben Arous and Péché showed that the spiked Wishart ensemble exhibits a sharp phase transition asymptotically: when the spike strength is above a critical threshold, it is possible to detect the presence of a spike based on the top eigenvalue, and below the threshold the top eigenvalue provides no information. Such results form the basis of our understanding of when PCA can detect a low-rank signal in the presence of noise. However, under structural assumptions on the spike, not all information is necessarily contained in the spectrum. We study the…
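The phase transition for the spiked Wishart ensemble described in the abstract can be checked with a small simulation. This is a hedged sketch, not the authors' code: it assumes N samples in dimension n with population covariance I + beta * x x^T, a uniformly random unit spike x, and aspect ratio gamma = n / N. The limiting values in `bbp_prediction` are the standard formulas from the random matrix literature, stated here as an assumption because the text above does not give them: the threshold is beta = sqrt(gamma), the bulk edge is (1 + sqrt(gamma))^2, and above the threshold the top eigenvalue tends to (1 + beta)(1 + gamma / beta). All function names are illustrative.

```python
import numpy as np

def spiked_wishart_top_eig(n, N, beta, rng):
    """Top eigenvalue of the sample covariance of N draws from N(0, I + beta x x^T)."""
    x = rng.standard_normal(n)
    x /= np.linalg.norm(x)                 # random unit-norm spike (assumed prior)
    z = rng.standard_normal((N, n))        # isotropic noise, one sample per row
    g = rng.standard_normal(N)             # per-sample coefficient along the spike
    samples = z + np.sqrt(beta) * np.outer(g, x)  # rows have covariance I + beta x x^T
    s = np.linalg.svd(samples, compute_uv=False)
    return s[0] ** 2 / N                   # largest eigenvalue of samples^T samples / N

def bbp_prediction(beta, gamma):
    """Assumed asymptotic location of the top eigenvalue for a positive spike."""
    if beta > np.sqrt(gamma):
        return (1.0 + beta) * (1.0 + gamma / beta)
    return (1.0 + np.sqrt(gamma)) ** 2

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n, N = 500, 1000
    for beta in (0.3, 2.0):
        observed = spiked_wishart_top_eig(n, N, beta, rng)
        predicted = bbp_prediction(beta, n / N)
        print(f"beta={beta}: observed={observed:.3f}, predicted={predicted:.3f}")
```

The sketch covers positive spikes only; the negative-spike case discussed in the findings would need a different statistic than the top eigenvalue.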
Taxonomy
Methods: Principal Components Analysis
