TL;DR
This paper investigates how dimensionality reduction, specifically PCA-based methods, can improve the robustness and generalization of overparameterized linear regression models, challenging the necessity of overparameterization for good generalization.
Contribution
The paper shows that, for some data models, PCA-based dimensionality reduction can prevent the risk from diverging as the number of parameters approaches the number of samples, and it compares several projection methods both theoretically and empirically.
Findings
PCA-OLS improves robustness against adversarial attacks.
Data-dependent projections outperform data-independent ones.
Overparameterization is not essential for good generalization.
Abstract
Overparameterization in deep learning is powerful: Very large models fit the training data perfectly and yet often generalize well. This realization brought back the study of linear models for regression, including ordinary least squares (OLS), which, like deep learning, shows a "double-descent" behavior: (1) The risk (expected out-of-sample prediction error) can grow arbitrarily when the number of parameters $p$ approaches the number of samples $n$, and (2) the risk decreases with $p$ for $p > n$, sometimes achieving a lower value than the lowest risk for $p < n$. The divergence of the risk for OLS can be avoided with regularization. In this work, we show that for some data models it can also be avoided with a PCA-based dimensionality reduction (PCA-OLS, also known as principal component regression). We provide non-asymptotic bounds for the risk of PCA-OLS by considering the alignments of…
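To make the comparison in the abstract concrete, here is a minimal NumPy sketch of minimum-norm OLS next to PCA-OLS (principal component regression) at the interpolation threshold, where the number of parameters equals the number of samples. It is an illustration under stated assumptions, not the paper's experimental setup: the synthetic data model (a few strong latent directions plus weak isotropic feature noise), the function names, and all parameter values are hypothetical choices made for this example, and PCA is done on uncentered data because the synthetic features have zero mean.

```python
import numpy as np


def min_norm_ols(X, y):
    """Minimum-norm least-squares coefficients via the pseudoinverse."""
    return np.linalg.pinv(X) @ y


def pca_ols(X, y, k):
    """PCA-OLS: regress y on the projection of X onto its top-k
    right singular vectors, then map the fit back to p coefficients.
    Assumption: X is (approximately) zero-mean, so no centering is done."""
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    V_k = Vt[:k].T                      # (p, k) principal directions
    Z = X @ V_k                         # (n, k) reduced features
    gamma, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return V_k @ gamma                  # (p,) coefficients in original space


def make_data(n, U, theta, rng, strength=5.0, feat_noise=0.1, label_noise=0.5):
    """Hypothetical data model: k strong directions (columns of U)
    plus weak isotropic feature noise; labels depend only on span(U)."""
    p, k = U.shape
    latent = rng.standard_normal((n, k)) * strength
    X = latent @ U.T + feat_noise * rng.standard_normal((n, p))
    y = X @ (U @ theta) + label_noise * rng.standard_normal(n)
    return X, y


def compare_risks(n=50, p=50, k=3, n_test=2000, seed=0):
    """Estimate out-of-sample mean squared error for both estimators."""
    rng = np.random.default_rng(seed)
    U, _ = np.linalg.qr(rng.standard_normal((p, k)))   # orthonormal columns
    theta = rng.standard_normal(k)
    X, y = make_data(n, U, theta, rng)
    X_test, y_test = make_data(n_test, U, theta, rng)

    beta_ols = min_norm_ols(X, y)
    beta_pca = pca_ols(X, y, k)

    risk_ols = np.mean((X_test @ beta_ols - y_test) ** 2)
    risk_pca = np.mean((X_test @ beta_pca - y_test) ** 2)
    return risk_ols, risk_pca


risk_ols, risk_pca = compare_risks()
print(f"min-norm OLS test MSE: {risk_ols:.3f}")
print(f"PCA-OLS      test MSE: {risk_pca:.3f}")
```

With p equal to n, the design matrix has very small singular values along the weak noise directions, so the pseudoinverse amplifies label noise. Dropping those directions before the regression is what keeps the PCA-OLS estimate stable in this toy setting. Varying `p` around `n` in `compare_risks` gives a rough picture of the double-descent curve for this particular data model.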
