Interpolation can hurt robust generalization even when there is no noise

Konstantin Donhauser; Alexandru Țifrea; Michael Aerni; Reinhard Heckel; Fanny Yang

arXiv:2108.02883·stat.ML·December 20, 2021·1 cites

TL;DR

This paper shows that in high-dimensional settings, avoiding interpolation through ridge regularization can improve robust generalization even without noise, challenging the view that ridge regularization has vanishing benefits once overparameterization implicitly reduces variance.

Contribution

The paper gives the first theoretical result on robust overfitting: it proves that ridge regularization can improve robust generalization even in the absence of noise.
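
For reference, the two estimators being contrasted can be written in their standard textbook form. The symbols here (design matrix $X \in \mathbb{R}^{n \times d}$, labels $y$, penalty $\lambda$) are notation assumed for this sketch and may differ from the paper's, and the closed form for the interpolator assumes $d \ge n$ with $X X^\top$ invertible.

```latex
\hat{\theta}_{\lambda}
  = \arg\min_{\theta} \; \frac{1}{n}\,\lVert X\theta - y \rVert_2^2 + \lambda \lVert \theta \rVert_2^2,
\qquad
\hat{\theta}_{0}
  = \lim_{\lambda \to 0^{+}} \hat{\theta}_{\lambda}
  = X^{\top} \left( X X^{\top} \right)^{-1} y .
```

Any $\lambda > 0$ gives an estimator that does not fit the training data exactly, which is what "avoiding interpolation" refers to.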

Findings

01 Ridge regularization can improve robust risk in noise-free settings.

02 Interpolating in the overparameterized regime does not always give the best robust generalization.

03 The theoretical results cover the robust risk of both linear regression and classification.

Abstract

Numerous recent works show that overparameterization implicitly reduces variance for min-norm interpolators and max-margin classifiers. These findings suggest that ridge regularization has vanishing benefits in high dimensions. We challenge this narrative by showing that, even in the absence of noise, avoiding interpolation through ridge regularization can significantly improve generalization. We prove this phenomenon for the robust risk of both linear regression and classification and hence provide the first theoretical result on robust overfitting.
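
To make "robust risk" concrete for linear regression, here is a minimal sketch of the worst-case squared loss under a bounded input perturbation. It assumes unconstrained l2 perturbations of radius `eps`, which may differ from the threat model analyzed in the paper. The function names and the toy numbers are hypothetical, chosen only for illustration.

```python
import math


def robust_squared_loss(theta, x, y, eps):
    """Worst-case squared loss over all perturbations delta with ||delta||_2 <= eps.

    For a linear predictor the maximum has the closed form
    (|<theta, x> - y| + eps * ||theta||_2) ** 2.
    """
    residual = sum(t * xi for t, xi in zip(theta, x)) - y
    theta_norm = math.sqrt(sum(t * t for t in theta))
    return (abs(residual) + eps * theta_norm) ** 2


def worst_case_perturbation(theta, x, y, eps):
    """Perturbation that attains the maximum: push x along theta, in the
    direction that increases the absolute residual."""
    residual = sum(t * xi for t, xi in zip(theta, x)) - y
    theta_norm = math.sqrt(sum(t * t for t in theta))
    if theta_norm == 0.0:
        return [0.0] * len(theta)
    sign = 1.0 if residual >= 0 else -1.0
    return [sign * eps * t / theta_norm for t in theta]


# Toy example (illustrative values only).
theta = [3.0, 4.0]
x = [1.0, 1.0]
y = 5.0
eps = 0.1

delta = worst_case_perturbation(theta, x, y, eps)
x_adv = [xi + di for xi, di in zip(x, delta)]

# Loss actually reached at the perturbed input vs. the closed-form bound.
attained = (sum(t * xi for t, xi in zip(theta, x_adv)) - y) ** 2
closed_form = robust_squared_loss(theta, x, y, eps)
```

Under this assumed threat model the penalty term `eps * ||theta||_2` grows with the parameter norm, so two estimators with the same standard loss can differ in robust loss. Averaging `robust_squared_loss` over test points gives an empirical robust risk.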

Code & Models

Videos

Interpolation can hurt robust generalization even when there is no noise · SlidesLive

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Face and Expression Recognition · Stochastic Gradient Optimization Techniques

Methods: Logistic Regression · Linear Regression