Sketching for Principal Component Regression

Liron Mor-Yosef; Haim Avron

arXiv:1803.02661·math.NA·March 8, 2019

TL;DR

This paper introduces efficient algorithms for approximate principal component regression (PCR) that significantly reduce computational costs while maintaining high accuracy, applicable to large-scale, streaming, and kernel data scenarios.

Contribution

The paper presents input sparsity time algorithms for approximate PCR, including streaming and kernel variants, with theoretical risk bounds and empirical validation.

Findings

01 Algorithms achieve high-quality approximations with low computational complexity

02 Empirical results show excellent performance on large-scale data

03 Methods are applicable to streaming and kernel PCR settings

Abstract

Principal component regression (PCR) is a useful method for regularizing linear regression. Although conceptually simple, straightforward implementations of PCR have high computational costs and so are inappropriate when learning with large-scale data. In this paper, we propose efficient algorithms for computing approximate PCR solutions that are, on one hand, high-quality approximations to the true PCR solutions (when viewed as minimizers of a constrained optimization problem), and on the other hand admit rigorous risk bounds (when viewed as statistical estimators). In particular, we propose input sparsity time algorithms for approximate PCR. We also consider computing an approximate PCR solution in the streaming model, and kernel PCR. Empirical results demonstrate the excellent performance of our proposed methods.
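
To make the abstract's setup concrete, the following is a minimal NumPy sketch of exact rank-k PCR next to a generic sketch-then-project approximation. It is an illustration under stated assumptions, not the algorithm proposed in the paper: the function names `pcr` and `sketched_pcr`, the dense Gaussian sketch, and the `sketch_rows` parameter are all hypothetical choices made here, and the data matrix is assumed to be already centered. A dense Gaussian sketch does not run in input sparsity time; it is used only because it is the simplest sketch to write down.

```python
import numpy as np


def pcr(A, b, k):
    """Exact rank-k PCR via a full SVD of A (assumes A is already centered).

    Returns the minimizer of ||A x - b||_2 over x in the span of the
    top-k right singular vectors of A.
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # x = V_k diag(1/s_k) U_k^T b
    return Vt[:k].T @ ((U[:, :k].T @ b) / s[:k])


def sketched_pcr(A, b, k, sketch_rows, seed=0):
    """Illustrative approximation (an assumption, not the paper's method).

    Estimates the top-k right singular subspace from a row-sketched matrix
    S @ A, then solves least squares restricted to that subspace.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    # Dense Gaussian sketch with sketch_rows rows (hypothetical choice).
    S = rng.standard_normal((sketch_rows, n)) / np.sqrt(sketch_rows)
    _, _, Vt = np.linalg.svd(S @ A, full_matrices=False)
    V = Vt[:k].T  # approximate basis for the dominant subspace
    # Least squares over x = V y.
    y, *_ = np.linalg.lstsq(A @ V, b, rcond=None)
    return V @ y


# Usage on synthetic data with an exactly rank-3 design matrix.
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 3)) @ rng.standard_normal((3, 10))
b = rng.standard_normal(200)
x_exact = pcr(A, b, k=3)
x_approx = sketched_pcr(A, b, k=3, sketch_rows=20)
print(np.linalg.norm(x_exact - x_approx))
```

When the design matrix has rank exactly k, as in this synthetic example, the sketched matrix has the same row space as A, so the two solutions agree up to rounding. On real data with a decaying spectrum the two generally differ, and the size of that gap is the kind of quantity the paper's approximation and risk bounds are about.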
