Quantifying the Estimation Error of Principal Components
Raphael Hauser, Raul Kangro, Jüri Lember, Heinrich Matzinger

TL;DR
This paper improves bounds on the estimation error of principal components in PCA, showing that eigenvectors can often be accurately reconstructed from fewer samples than previously thought.
Contribution
It sharpens existing bounds on PCA eigenvector estimation error and demonstrates that accurate reconstruction is possible with smaller sample sizes.
Findings
Sharper bounds on eigenvector estimation error
Eigenvectors can be reconstructed with fewer samples
Improved understanding of PCA sample complexity
Abstract
Principal component analysis is an important pattern recognition and dimensionality reduction tool in many applications. Principal components are computed as eigenvectors of a maximum likelihood covariance $\widehat{\Sigma}$ that approximates a population covariance $\Sigma$, and these eigenvectors are often used to extract structural information about the variables (or attributes) of the studied population. Since PCA is based on the eigendecomposition of the proxy covariance $\widehat{\Sigma}$ rather than the ground-truth $\Sigma$, it is important to understand the approximation error in each individual eigenvector as a function of the number of available samples. The recent results of Koltchinskii and Lounici yield such bounds. In the present paper we sharpen these bounds and show that eigenvectors can often be reconstructed to a required accuracy from a sample of strictly smaller size…
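The dependence of eigenvector error on sample size described in the abstract can be illustrated with a small simulation. This is a minimal sketch, not the paper's method or its bounds: the Gaussian model, the diagonal population covariance, the sine-of-angle error measure, and the names `leading_eigvec_error` and `mean_error` are all assumptions chosen for illustration.

```python
import numpy as np

def leading_eigvec_error(n, pop_cov, rng):
    """Sine of the angle between the leading eigenvector of the population
    covariance and that of the maximum likelihood sample covariance."""
    d = pop_cov.shape[0]
    # Assumption: samples are Gaussian with zero mean and covariance pop_cov.
    x = rng.multivariate_normal(np.zeros(d), pop_cov, size=n)
    # bias=True divides by n, which gives the maximum likelihood estimate.
    sample_cov = np.cov(x, rowvar=False, bias=True)
    # eigh sorts eigenvalues in ascending order; the last column is the leading eigenvector.
    u_true = np.linalg.eigh(pop_cov)[1][:, -1]
    u_hat = np.linalg.eigh(sample_cov)[1][:, -1]
    # Eigenvectors are defined up to sign, so compare via |cos| of the angle.
    cos = abs(u_true @ u_hat)
    return float(np.sqrt(max(0.0, 1.0 - cos**2)))

def mean_error(n, pop_cov, trials=50, seed=0):
    """Average the error over independent trials for one sample size n."""
    rng = np.random.default_rng(seed)
    return float(np.mean([leading_eigvec_error(n, pop_cov, rng) for _ in range(trials)]))

# Hypothetical population covariance: one dominant direction, spectral gap of 3.
pop_cov = np.diag([4.0, 1.0, 1.0, 1.0, 1.0])
errors = {n: mean_error(n, pop_cov) for n in (50, 500, 5000)}
for n, err in errors.items():
    print(f"n={n:5d}  mean sin(angle) = {err:.3f}")
```

In this toy setting the average error shrinks as the number of samples grows. How fast it must shrink, and how few samples suffice for a given accuracy, is the question the paper's bounds address.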
Taxonomy
Topics: Statistical Methods and Inference · Gene expression and cancer classification · Blind Source Separation Techniques
