A Risk Comparison of Ordinary Least Squares vs Ridge Regression
Paramveer S. Dhillon, Dean P. Foster, Sham M. Kakade, Lyle H. Ungar

TL;DR
This paper compares the risk of ridge regression to that of a PCA-based least squares method, showing that the risk of the PCA-based method is within a constant factor (namely 4) of the risk of ridge regression.
Contribution
The note gives a theoretical risk comparison between ridge regression and ordinary least squares run on a PCA-selected subspace, bounding the risk of the latter by a constant multiple of the risk of the former.
Findings
The risk of PCA-based least squares is within a factor of 4 of the risk of ridge regression.
The theoretical analysis quantifies the relationship between the two methods with an explicit constant.
The result suggests that projecting onto principal components and then running un-regularized least squares is competitive with ridge regression.
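To see why a constant of 4 is plausible, here is a sketch of a per-coordinate argument. Several things in it are assumptions not stated in this summary: a fixed-design linear model with noise variance $\sigma^2$, coordinates taken in the eigenbasis of $X^\top X / n$ with eigenvalues $\lambda_j$, a ridge penalty $\lambda$, and a PCA rule that keeps exactly the components with $\lambda_j \ge \lambda$. The symbols are illustrative and may differ from the paper's notation.

```latex
% Per-coordinate risk contributions (bias^2 + variance), assumed setup:
\begin{align*}
  \text{ridge:}\quad
    & \lambda_j \beta_j^2 \left(\frac{\lambda}{\lambda_j + \lambda}\right)^{2}
      + \frac{\sigma^2}{n} \left(\frac{\lambda_j}{\lambda_j + \lambda}\right)^{2}, \\
  \text{PCA-OLS:}\quad
    & \begin{cases}
        \dfrac{\sigma^2}{n}, & \lambda_j \ge \lambda \quad (\text{component kept}), \\[1.5ex]
        \lambda_j \beta_j^2, & \lambda_j < \lambda \quad (\text{component dropped}).
      \end{cases}
\end{align*}
% If \lambda_j \ge \lambda, then \lambda_j/(\lambda_j+\lambda) \ge 1/2,
%   so the ridge variance term is at least \sigma^2/(4n).
% If \lambda_j < \lambda, then \lambda/(\lambda_j+\lambda) > 1/2,
%   so the ridge bias term is at least \lambda_j \beta_j^2 / 4.
% Summing over j:
\[
  \mathrm{Risk}\bigl(\hat\beta_{\mathrm{PCA},\lambda}\bigr)
  \;\le\; 4\,\mathrm{Risk}\bigl(\hat\beta_{\lambda}\bigr).
\]
```

In either case the PCA-OLS contribution is at most four times the ridge contribution for the same coordinate, which is where the factor of 4 comes from under these assumptions.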
Abstract
We compare the risk of ridge regression to a simple variant of ordinary least squares, in which one simply projects the data onto a finite dimensional subspace (as specified by a Principal Component Analysis) and then performs an ordinary (un-regularized) least squares regression in this subspace. This note shows that the risk of this ordinary least squares method is within a constant factor (namely 4) of the risk of ridge regression.
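The comparison described in the abstract can be checked numerically with a minimal sketch like the one below. It is not the authors' code. It assumes a fixed-design linear model with known noise variance, a ridge objective of the form (1/n)||y - Xb||^2 + lam * ||b||^2, and a PCA rule that keeps the components whose eigenvalue of X^T X / n is at least `lam`. The function name `fixed_design_risks` and all parameter names are hypothetical.

```python
import numpy as np


def fixed_design_risks(X, beta, sigma2, lam):
    """Closed-form in-sample risks of ridge and PCA-truncated OLS.

    Assumed setup (not taken from the summary above): y = X @ beta + noise,
    noise variance sigma2, ridge penalty lam > 0 on (1/n)||y - Xb||^2,
    and PCA-OLS keeping eigen-directions of X^T X / n with eigenvalue >= lam.
    """
    n = X.shape[0]
    eigvals, eigvecs = np.linalg.eigh(X.T @ X / n)
    eigvals = np.clip(eigvals, 0.0, None)   # guard against tiny negative round-off
    b = eigvecs.T @ beta                    # true coefficients in the eigenbasis

    # Ridge shrinks the OLS coefficient in direction j by eigval / (eigval + lam).
    shrink = eigvals / (eigvals + lam)
    ridge_bias = np.sum(eigvals * b**2 * (1.0 - shrink) ** 2)
    ridge_var = sigma2 / n * np.sum(shrink**2)

    # PCA-OLS: no shrinkage on kept directions, dropped directions are set to zero.
    keep = eigvals >= lam
    pca_bias = np.sum(eigvals[~keep] * b[~keep] ** 2)
    pca_var = sigma2 / n * np.sum(keep)

    return float(ridge_bias + ridge_var), float(pca_bias + pca_var)


# Synthetic design with unequal column scales so the spectrum is spread out.
rng = np.random.default_rng(0)
n, p = 200, 20
X = rng.standard_normal((n, p)) * np.linspace(0.1, 3.0, p)
beta = rng.standard_normal(p)

lams = np.logspace(-3, 2, 30)
ratios = []
for lam in lams:
    ridge_risk, pca_risk = fixed_design_risks(X, beta, sigma2=1.0, lam=lam)
    ratios.append(pca_risk / ridge_risk)

max_ratio = max(ratios)  # under the assumptions above this should not exceed 4
```

Using closed-form risks instead of Monte Carlo simulation keeps the check deterministic; a simulation over repeated noise draws would estimate the same quantities with sampling error.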
Taxonomy
Topics: Statistical and numerical algorithms · Neural Networks and Applications · Fault Detection and Control Systems
