Calibrated Principal Component Regression

Yixuan Florence Wu; Yilun Zhu; Lei Cao; Naichen Shi

arXiv:2510.19020·stat.ML·April 27, 2026

TL;DR

This paper introduces Calibrated Principal Component Regression (CPCR), which reduces the truncation bias of PCR and improves prediction in high-dimensional generalized linear models by calibrating the PCR estimate in the original feature space with a centered Tikhonov step.

Contribution

The paper proposes CPCR, which calibrates the PCR estimate in the original feature space to control truncation bias, and derives its out-of-sample risk in the overparameterized random matrix regime.

Findings

01. CPCR outperforms standard PCR when the regression signal has non-negligible components in low-variance directions.

02. Empirical results show that CPCR consistently improves prediction in overparameterized problems.

03. A random-matrix analysis of out-of-sample risk shows that CPCR's risk is lower than PCR's in certain regimes.

Abstract

We propose a new method for statistical inference in generalized linear models. In the overparameterized regime, Principal Component Regression (PCR) reduces variance by projecting high-dimensional data to a low-dimensional principal subspace before fitting. However, PCR incurs truncation bias whenever the true regression vector has mass outside the retained principal components (PC). To mitigate the bias, we propose Calibrated Principal Component Regression (CPCR), which first learns a low-variance prior in the PC subspace and then calibrates the model in the original feature space via a centered Tikhonov step. CPCR leverages cross-fitting and controls the truncation bias by softening PCR's hard cutoff. Theoretically, we calculate the out-of-sample risk in the random matrix regime, which shows that CPCR outperforms standard PCR when the regression signal has non-negligible components…
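
The two-stage idea in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses squared loss (a linear model rather than a general GLM), reads the "centered Tikhonov step" as a ridge penalty centered at the PCR estimate, and reads "cross-fitting" as a two-fold swap-and-average. The function names (`pcr_fit`, `centered_tikhonov`, `cpcr_sketch`) and the parameters `k` and `lam` are hypothetical, and no intercept or feature centering is included.

```python
import numpy as np

def pcr_fit(X, y, k):
    """PCR: least squares on the top-k principal directions of X."""
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    V_k = Vt[:k].T                        # (p, k) top-k right singular vectors
    gamma, *_ = np.linalg.lstsq(X @ V_k, y, rcond=None)
    return V_k @ gamma                    # coefficients in the original space

def centered_tikhonov(X, y, beta_prior, lam):
    """Minimize ||y - X b||^2 + lam * ||b - beta_prior||^2 (assumed form).

    Uses the dual (n x n) solve, which is cheaper when p > n.
    """
    n = X.shape[0]
    resid = y - X @ beta_prior
    alpha = np.linalg.solve(X @ X.T + lam * np.eye(n), resid)
    return beta_prior + X.T @ alpha

def cpcr_sketch(X, y, k, lam, seed=0):
    """Two-fold cross-fitting (an assumption): learn the PCR prior on one
    half, calibrate on the other half, swap the roles, and average."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    half = len(y) // 2
    folds = [(idx[:half], idx[half:]), (idx[half:], idx[:half])]
    estimates = []
    for prior_idx, calib_idx in folds:
        beta_prior = pcr_fit(X[prior_idx], y[prior_idx], k)
        estimates.append(
            centered_tikhonov(X[calib_idx], y[calib_idx], beta_prior, lam)
        )
    return np.mean(estimates, axis=0)
```

In this sketch, a very large `lam` returns the PCR prior unchanged, while a small `lam` lets the calibration step move the estimate away from the principal subspace; this is one way to picture the "softening" of PCR's hard cutoff that the abstract describes.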
