Finite sample approximation results for principal component analysis: a matrix perturbation approach

Boaz Nadler

arXiv:0901.3245·math.ST·January 22, 2009

TL;DR

This paper provides finite sample bounds and a matrix perturbation perspective on PCA eigenvalues and eigenvectors, analyzing their relation to population PCA and phase transition phenomena in high-dimensional settings.

Contribution

It introduces a nonasymptotic, high-probability theorem for sample PCA eigenvalues and eigenvectors under a spiked covariance model, and offers a matrix perturbation view of phase transitions.

Findings

01 Finite sample bounds for PCA eigenvalues and eigenvectors.

02 Analysis of phase transition and eigenvector loss in high-dimensional PCA.

03 Eigenvector stability depends on noise level and sample size.

Abstract

Principal component analysis (PCA) is a standard tool for dimensional reduction of a set of $n$ observations (samples), each with $p$ variables. In this paper, using a matrix perturbation approach, we study the nonasymptotic relation between the eigenvalues and eigenvectors of PCA computed on a finite sample of size $n$, and those of the limiting population PCA as $n \to \infty$. As in machine learning, we present a finite sample theorem which holds with high probability for the closeness between the leading eigenvalue and eigenvector of sample PCA and population PCA under a spiked covariance model. In addition, we also consider the relation between finite sample PCA and the asymptotic results in the joint limit $p, n \to \infty$, with $p / n = c$. We present a matrix perturbation view of the "phase transition phenomenon," and a simple linear-algebra based derivation of the eigenvalue and…
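To make the setting concrete, the following is a minimal simulation sketch of a single-spike covariance model. It is not taken from the paper. It compares the leading eigenvalue and eigenvector of the sample covariance with their population counterparts. The function name `spiked_pca_overlap`, its parameters, the Gaussian data, and the choice of the first coordinate axis as the signal direction are all assumptions made for illustration.

```python
import numpy as np


def spiked_pca_overlap(n, p, signal_var, noise_std=1.0, seed=0):
    """Simulate a single-spike model and compare sample PCA to population PCA.

    Hypothetical helper for illustration only. Each observation is
        x = sqrt(signal_var) * u * v + noise_std * xi,
    with u ~ N(0, 1), xi ~ N(0, I_p), and v a fixed unit vector, so the
    population covariance is signal_var * v v^T + noise_std**2 * I_p.
    """
    rng = np.random.default_rng(seed)

    # Assumed signal direction: the first coordinate axis.
    v = np.zeros(p)
    v[0] = 1.0

    u = rng.standard_normal(n)            # random signal amplitudes
    noise = rng.standard_normal((n, p))   # isotropic Gaussian noise
    X = np.sqrt(signal_var) * np.outer(u, v) + noise_std * noise

    # Sample covariance; the mean is known to be zero in this model.
    S = X.T @ X / n

    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(S)
    top_val = eigvals[-1]
    top_vec = eigvecs[:, -1]

    # The sign of an eigenvector is arbitrary, so report |<v_sample, v>|.
    overlap = abs(top_vec @ v)
    return top_val, overlap


if __name__ == "__main__":
    signal_var, noise_std, p = 4.0, 1.0, 100
    print("population leading eigenvalue:", signal_var + noise_std**2)
    for n in (50, 500, 5000):
        val, ov = spiked_pca_overlap(n, p, signal_var, noise_std)
        print(f"n={n:5d}  p/n={p / n:6.3f}  sample eigenvalue={val:.3f}  overlap={ov:.3f}")
```

Running the script with a fixed $p$ and increasing $n$ gives a rough feel for how the sample quantities approach the population ones as $p / n$ shrinks. It is a numerical illustration only and does not reproduce the paper's bounds.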
