Sketching for Principal Component Regression
Liron Mor-Yosef, Haim Avron

TL;DR
This paper introduces efficient algorithms for approximate principal component regression (PCR) that significantly reduce computational costs while maintaining high accuracy, applicable to large-scale, streaming, and kernel data scenarios.
Contribution
The paper presents input sparsity time algorithms for approximate PCR, including streaming and kernel variants, with theoretical risk bounds and empirical validation.
Findings
Algorithms achieve high-quality approximations with low computational complexity
Empirical results show excellent performance on large-scale data
Methods are applicable to streaming and kernel PCR settings
Abstract
Principal component regression (PCR) is a useful method for regularizing linear regression. Although conceptually simple, straightforward implementations of PCR have high computational costs and so are unsuitable for learning with large-scale data. In this paper, we propose efficient algorithms for computing approximate PCR solutions that, on the one hand, are high-quality approximations to the true PCR solutions (when viewed as minimizers of a constrained optimization problem) and, on the other hand, admit rigorous risk bounds (when viewed as statistical estimators). In particular, we propose input sparsity time algorithms for approximate PCR. We also consider computing approximate PCR in the streaming model, as well as kernel PCR. Empirical results demonstrate the excellent performance of the proposed methods.
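To make the abstract's idea concrete, here is a minimal NumPy sketch of exact PCR next to a randomly compressed variant. It illustrates the general approach under stated assumptions and is not the authors' algorithm. The function names `pcr` and `sketched_pcr`, the sketch size `t`, and the dense Gaussian matrix `R` are all hypothetical choices made for this example. A dense Gaussian projection in particular does not run in input sparsity time; a sparse sketching matrix would be needed for that.

```python
import numpy as np

def pcr(X, y, k):
    """Exact PCR: least squares restricted to the top-k right singular vectors of X."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # Minimum-norm least-squares solution within the span of the top-k principal directions.
    return Vt[:k].T @ ((U[:, :k].T @ y) / s[:k])

def sketched_pcr(X, y, k, t, seed=0):
    """Approximate PCR (illustrative): compress the d features to t columns with a
    random matrix R, run PCR on the smaller matrix X @ R, and map the solution back."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Assumption: dense Gaussian sketch, chosen for simplicity rather than speed.
    R = rng.standard_normal((d, t)) / np.sqrt(t)
    return R @ pcr(X @ R, y, k)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Synthetic data whose feature matrix has exact rank 5.
    X = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 50))
    y = rng.standard_normal(200)
    w_exact = pcr(X, y, k=5)
    w_sketch = sketched_pcr(X, y, k=5, t=20)
    print(np.linalg.norm(X @ w_exact - X @ w_sketch))
```

When X has exact rank k, as in this synthetic example, the compressed matrix spans the same column space with probability one, so the two fitted vectors agree up to rounding error. For general data the compressed solution is only an approximation, and the paper's bounds are what quantify its quality.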
