Edgeworth correction for the largest eigenvalue in a spiked PCA model

Jeha Yang; Iain M. Johnstone

arXiv:1710.06899 · math.ST · October 20, 2017

TL;DR

This paper develops Edgeworth corrections to the limiting Gaussian approximation for the distribution of the largest sample eigenvalue in a high-dimensional spiked PCA model. The leading (skewness) correction is a quadratic polynomial whose coefficients reflect the high-dimensional structure.

Contribution

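It derives a skewness (first-order Edgeworth) correction for the distribution of the largest eigenvalue of the sample covariance matrix in the supercritical spiked model. The correction is obtained by conditioning on the sample noise eigenvalues and then using both their limiting bulk properties and their fluctuations.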
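
For orientation, the classical one-term Edgeworth expansion that this kind of correction generalizes can be written as follows. It is the textbook statement for a standardized mean of i.i.d. variables with finite third moment and a non-lattice distribution. It is not the paper's formula, and the symbols $\bar{X}_n$, $\mu$, $\sigma$, $\kappa_3$, $\Phi$ and $\varphi$ are introduced here only for illustration:

$$
P\left( \frac{\sqrt{n}\,(\bar{X}_n - \mu)}{\sigma} \le x \right)
= \Phi(x) - \frac{\kappa_3}{6\sqrt{n}}\,(x^2 - 1)\,\varphi(x) + o\!\left(n^{-1/2}\right),
\qquad
\kappa_3 = \frac{E (X_1 - \mu)^3}{\sigma^3}.
$$

The quadratic polynomial $x^2 - 1$ multiplying the normal density $\varphi$ is the "skewness correction" of the classical setting. In the spiked model the polynomial is still quadratic, but its coefficients depend on the high-dimensional structure.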

Findings

01. Edgeworth corrections improve Gaussian approximation accuracy

02. Coefficients reflect high-dimensional structure

03. Method accounts for fluctuations of noise eigenvalues

Abstract

We study improved approximations to the distribution of the largest eigenvalue $\hat{ℓ}$ of the sample covariance matrix of $n$ zero-mean Gaussian observations in dimension $p + 1$. We assume that one population principal component has variance $ℓ > 1$ and the remaining 'noise' components have common variance $1$. In the high dimensional limit $p / n \to γ > 0$, we begin the study of Edgeworth corrections to the limiting Gaussian distribution of $\hat{ℓ}$ in the supercritical case $ℓ > 1 + \sqrt{γ}$. The skewness correction involves a quadratic polynomial as in classical settings, but the coefficients reflect the high dimensional structure. The methods involve Edgeworth expansions for sums of independent non-identically distributed variates obtained by conditioning on the sample noise eigenvalues, and limiting bulk properties and fluctuations of these noise…
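
The setting in the abstract can be explored numerically. The following is a minimal Monte Carlo sketch, not the authors' method: it simulates the largest sample eigenvalue in the spiked model and standardizes it with the first-order Gaussian limit, so that any leftover skewness (the effect an Edgeworth correction targets) can be inspected. The centering $ρ = ℓ(1 + γ/(ℓ - 1))$ and scale $σ = ℓ\sqrt{2(1 - γ/(ℓ - 1)^2)}$ are the standard first-order constants for the supercritical spiked model and are an assumption here, since the abstract does not state them. The function names, sample sizes and seed are hypothetical choices for illustration.

```python
import numpy as np


def gaussian_limit(ell, gamma):
    """Centering rho and scale sigma of the first-order Gaussian limit.

    Assumed from standard spiked-model theory (not quoted in the abstract):
    sqrt(n) * (ell_hat - rho) is approximately N(0, sigma^2).
    """
    if ell <= 1 + np.sqrt(gamma):
        raise ValueError("requires the supercritical case ell > 1 + sqrt(gamma)")
    rho = ell * (1 + gamma / (ell - 1))
    sigma = ell * np.sqrt(2 * (1 - gamma / (ell - 1) ** 2))
    return rho, sigma


def simulate_largest_eigenvalue(n, p, ell, reps=500, seed=0):
    """Draw `reps` copies of the largest eigenvalue of the sample covariance.

    Each sample has n zero-mean Gaussian observations in dimension p + 1,
    with population variances (ell, 1, ..., 1).
    """
    rng = np.random.default_rng(seed)
    scale = np.ones(p + 1)
    scale[0] = np.sqrt(ell)  # the single spiked direction
    largest = np.empty(reps)
    for r in range(reps):
        x = rng.standard_normal((n, p + 1)) * scale
        s = x.T @ x / n  # sample covariance (the mean is known to be zero)
        largest[r] = np.linalg.eigvalsh(s)[-1]  # eigenvalues are ascending
    return largest


def standardized_summary(n, p, ell, reps=500, seed=0):
    """Mean, standard deviation and skewness of the standardized eigenvalue."""
    rho, sigma = gaussian_limit(ell, p / n)
    lam = simulate_largest_eigenvalue(n, p, ell, reps, seed)
    z = np.sqrt(n) * (lam - rho) / sigma
    zc = z - z.mean()
    skewness = np.mean(zc**3) / np.mean(zc**2) ** 1.5
    return {"mean": z.mean(), "std": z.std(), "skewness": skewness}
```

Calling `standardized_summary(200, 50, 4.0)` returns the three summary statistics for one configuration. If the Gaussian limit were exact, the mean and skewness would be near 0 and the standard deviation near 1, so a clearly nonzero skewness at moderate $n$ is the kind of discrepancy a skewness correction is meant to capture.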

Taxonomy

Topics: Random Matrices and Applications · Statistical Methods and Bayesian Inference · Stochastic processes and statistical mechanics