A note on the prediction error of principal component regression in high dimensions
Laura Hucker, Martin Wahl

TL;DR
This paper studies the prediction error of principal component regression (PCR) in high dimensions. It proves high probability bounds that compare PCR with an oracle method, and it describes how the upward bias of empirical eigenvalues affects PCR.
Contribution
It offers theoretical bounds for PCR's prediction error in high dimensions and analyzes the impact of eigenvalue bias on its regularization properties.
Findings
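PCR performs comparably to the oracle method, which uses population rather than empirical principal components, provided that an effective rank condition holds.
When the effective rank condition is violated, empirical eigenvalues acquire a significant upward bias, which results in a self-induced regularization of PCR.
High probability bounds are established for the squared risk of PCR conditional on the design.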
Abstract
We analyze the prediction error of principal component regression (PCR) and prove high probability bounds for the corresponding squared risk conditional on the design. Our first main result shows that PCR performs comparably to the oracle method obtained by replacing empirical principal components by their population counterparts, provided that an effective rank condition holds. On the other hand, if the latter condition is violated, then empirical eigenvalues start to have a significant upward bias, resulting in a self-induced regularization of PCR. Our approach relies on the behavior of empirical eigenvalues, empirical eigenvectors and the excess risk of principal component analysis in high-dimensional regimes.
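The comparison between PCR and its oracle counterpart can be illustrated with a small simulation. The following is a minimal sketch, not the authors' setup: the diagonal covariance with polynomially decaying eigenvalues, the uncentered empirical covariance, the noise level, and the function names `pcr_fit`, `oracle_pcr_fit` and `prediction_error` are all assumptions made for illustration. The error measure used here, the quadratic form of the estimation error in the population covariance, is also an assumed choice and may differ from the paper's exact definition of the squared risk.

```python
import numpy as np


def pcr_fit(X, y, k):
    """PCR: least squares of y on the top-k empirical principal components of X."""
    n = X.shape[0]
    # Empirical covariance without centering (assumes a mean-zero design).
    cov = X.T @ X / n
    # eigh returns eigenvalues in ascending order, so reverse the columns.
    _, eigvecs = np.linalg.eigh(cov)
    U = eigvecs[:, ::-1][:, :k]
    coef, *_ = np.linalg.lstsq(X @ U, y, rcond=None)
    # Map the k coefficients back to the original p-dimensional space.
    return U @ coef


def oracle_pcr_fit(X, y, U_pop):
    """Oracle: same regression, but on the population principal components."""
    coef, *_ = np.linalg.lstsq(X @ U_pop, y, rcond=None)
    return U_pop @ coef


def prediction_error(beta_hat, beta, Sigma):
    """Assumed error measure: (beta_hat - beta)^T Sigma (beta_hat - beta)."""
    diff = beta_hat - beta
    return float(diff @ Sigma @ diff)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, p, k = 100, 200, 5  # more features than observations

    # Assumed model: diagonal covariance with eigenvalues j^{-2}, so the
    # population principal components are the standard basis vectors.
    pop_eigvals = 1.0 / np.arange(1, p + 1) ** 2
    Sigma = np.diag(pop_eigvals)
    X = rng.standard_normal((n, p)) * np.sqrt(pop_eigvals)

    beta = rng.standard_normal(p)
    y = X @ beta + 0.1 * rng.standard_normal(n)

    beta_pcr = pcr_fit(X, y, k)
    beta_oracle = oracle_pcr_fit(X, y, np.eye(p)[:, :k])

    print("PCR error:   ", prediction_error(beta_pcr, beta, Sigma))
    print("Oracle error:", prediction_error(beta_oracle, beta, Sigma))
```

Changing how quickly `pop_eigvals` decays, or the ratio of `p` to `n`, gives an informal way to see how close the two errors stay to each other in different regimes.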
Taxonomy
Topics: Statistical Methods and Inference · Stochastic Gradient Optimization Techniques · Markov Chains and Monte Carlo Methods
