Spectrum Estimation: A Unified Framework for Covariance Matrix Estimation and PCA in Large Dimensions
Olivier Ledoit, Michael Wolf

TL;DR
This paper introduces a unified framework for covariance matrix estimation and PCA in high-dimensional settings, utilizing nonlinear eigenvalue shrinkage and consistent spectrum estimation to improve accuracy over traditional methods.
Contribution
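It develops a novel, asymptotically consistent approach to estimating the eigenvalue spectrum of the population covariance matrix, which in turn makes it possible to consistently estimate an oracle nonlinear shrinkage for covariance matrix estimation and PCA in large dimensions.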
Findings
The proposed methods outperform previous approaches in Monte Carlo simulations
The population spectrum is estimated consistently
Covariance matrix estimation and PCA become more accurate in high-dimensional data
Abstract
Covariance matrix estimation and principal component analysis (PCA) are two cornerstones of multivariate analysis. Classic textbook solutions perform poorly when the dimension of the data is of a magnitude similar to the sample size, or even larger. In such settings, there is a common remedy for both statistical problems: nonlinear shrinkage of the eigenvalues of the sample covariance matrix. The optimal nonlinear shrinkage formula depends on unknown population quantities and is thus not available. It is, however, possible to consistently estimate an oracle nonlinear shrinkage, which is motivated on asymptotic grounds. A key tool to this end is consistent estimation of the set of eigenvalues of the population covariance matrix (also known as the spectrum), an interesting and challenging problem in its own right. Extensive Monte Carlo simulations demonstrate that our methods have…
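To make the idea of eigenvalue shrinkage concrete, the following is a minimal sketch, not the authors' estimator. It keeps the eigenvectors of the sample covariance matrix and replaces only the eigenvalues. The function names `shrink_eigenvalues` and `linear_toward_mean`, the intensity `alpha = 0.5`, and the simulated setup (identity population covariance, n = 200, p = 100) are all assumptions chosen for illustration. The placeholder rule is a simple linear pull toward the average eigenvalue; the paper's oracle formula is nonlinear and depends on the estimated population spectrum, which is not reproduced here.

```python
import numpy as np

def shrink_eigenvalues(X, shrink_fn):
    """Keep the sample eigenvectors and transform only the sample eigenvalues.

    X is an (n, p) data matrix; shrink_fn maps the vector of sample
    eigenvalues to a vector of replacement eigenvalues.
    """
    n, p = X.shape
    Xc = X - X.mean(axis=0)                 # center each variable
    S = Xc.T @ Xc / (n - 1)                 # sample covariance matrix
    lam, U = np.linalg.eigh(S)              # eigenvalues (ascending), eigenvectors
    d = shrink_fn(lam)                      # replacement eigenvalues
    Sigma_hat = (U * d) @ U.T               # U diag(d) U^T
    return S, lam, Sigma_hat

def linear_toward_mean(alpha):
    """Placeholder rule (an assumption, NOT the paper's nonlinear formula):
    pull every eigenvalue toward the average eigenvalue with intensity alpha."""
    return lambda lam: (1.0 - alpha) * lam + alpha * lam.mean()

rng = np.random.default_rng(0)
n, p = 200, 100                             # dimension comparable to sample size
X = rng.standard_normal((n, p))             # population covariance is the identity

S, lam, Sigma_hat = shrink_eigenvalues(X, linear_toward_mean(0.5))

# All population eigenvalues equal 1, yet the sample eigenvalues are spread out.
print("sample eigenvalue range:", lam.min(), lam.max())

err_sample = np.linalg.norm(S - np.eye(p), "fro")
err_shrunk = np.linalg.norm(Sigma_hat - np.eye(p), "fro")
print("Frobenius error, sample covariance:", err_sample)
print("Frobenius error, shrunk eigenvalues:", err_shrunk)
```

In this setup the sample eigenvalues scatter well above and below 1 even though every population eigenvalue equals 1, and pulling them back toward their mean reduces the Frobenius error. A nonlinear rule would instead apply a different amount of shrinkage to each eigenvalue, which is where an estimate of the population spectrum comes in.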
