Dimensionality reduction, regularization, and generalization in overparameterized regressions

Ningyuan Huang; David W. Hogg; Soledad Villar

arXiv:2011.11477·stat.ML·April 7, 2022

TL;DR

This paper investigates how dimensionality reduction, specifically PCA-based methods, can improve the robustness and generalization of overparameterized linear regression models, challenging the necessity of overparameterization for good generalization.

Contribution

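It demonstrates that, for some data models, PCA-based dimensionality reduction (PCA-OLS) can prevent the divergence of the OLS risk that occurs as the number of parameters $p$ approaches the number of samples $n$, and it compares various projection methods both theoretically and empirically.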

Findings

1. PCA-OLS improves robustness against adversarial attacks.
2. Data-dependent projections outperform data-independent ones.
3. Overparameterization is not essential for good generalization.

Abstract

Overparameterization in deep learning is powerful: Very large models fit the training data perfectly and yet often generalize well. This realization brought back the study of linear models for regression, including ordinary least squares (OLS), which, like deep learning, shows a "double-descent" behavior: (1) The risk (expected out-of-sample prediction error) can grow arbitrarily when the number of parameters $p$ approaches the number of samples $n$, and (2) the risk decreases with $p$ for $p > n$, sometimes achieving a lower value than the lowest risk for $p < n$. The divergence of the risk for OLS can be avoided with regularization. In this work, we show that for some data models it can also be avoided with a PCA-based dimensionality reduction (PCA-OLS, also known as principal component regression). We provide non-asymptotic bounds for the risk of PCA-OLS by considering the alignments of…

Code & Models

Repositories

nhuang37/dimensionality_reduction
Official