On the Minimal Error of Empirical Risk Minimization

Gil Kur; Alexander Rakhlin

arXiv:2102.12066 · math.ST · February 25, 2021

Open Access

TL;DR

This paper studies the minimal error of Empirical Risk Minimization (ERM) in regression. It shows how the design setting (fixed or random) determines whether ERM can adapt to simpler models, and gives sharp lower bounds for both Donsker and non-Donsker classes.

Contribution

The paper gives new sharp lower bounds on ERM's minimal error in regression. The bounds show how the fixed and random design settings differ in whether ERM can adapt to the simplicity of the model generating the data.

Findings

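01 In fixed design, the error of ERM is governed by the global complexity of the entire class.

02 In random design, ERM may adapt to simpler models only if the local neighborhoods around the regression function are nearly as complex as the class itself.

03 The paper provides sharp lower bounds on ERM's performance for both Donsker and non-Donsker classes.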

Abstract

We study the minimal error of the Empirical Risk Minimization (ERM) procedure in the task of regression, both in the random and the fixed design settings. Our sharp lower bounds shed light on the possibility (or impossibility) of adapting to simplicity of the model generating the data. In the fixed design setting, we show that the error is governed by the global complexity of the entire class. In contrast, in random design, ERM may only adapt to simpler models if the local neighborhoods around the regression function are nearly as complex as the class itself, a somewhat counter-intuitive conclusion. We provide sharp lower bounds for performance of ERM for both Donsker and non-Donsker classes. We also discuss our results through the lens of recent studies on interpolation in overparameterized models.
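
To make the two settings in the abstract concrete, here is a minimal sketch of least-squares ERM over a small finite class, with the error measured in both the fixed-design and the random-design sense. Everything specific in it is an assumption for illustration and does not come from the paper: the toy class of lines through the origin, the Gaussian noise level, the uniform covariate distribution, and the helper names `erm`, `fixed_design_error`, and `random_design_error`.

```python
import random


def erm(function_class, xs, ys):
    """Return the function in the class with the smallest empirical squared loss."""
    def empirical_risk(f):
        return sum((y - f(x)) ** 2 for x, y in zip(xs, ys)) / len(xs)
    return min(function_class, key=empirical_risk)


def fixed_design_error(f_hat, f_star, xs):
    """Squared error averaged over the observed design points only."""
    return sum((f_hat(x) - f_star(x)) ** 2 for x in xs) / len(xs)


def random_design_error(f_hat, f_star, sample_x, rng, n_mc=20000):
    """Monte Carlo estimate of the squared error under the covariate distribution."""
    total = 0.0
    for _ in range(n_mc):
        x = sample_x(rng)
        total += (f_hat(x) - f_star(x)) ** 2
    return total / n_mc


# Hypothetical toy class: lines through the origin with slopes on a grid.
slopes = [k / 10 for k in range(-20, 21)]
function_class = [lambda x, a=a: a * x for a in slopes]
f_star = function_class[25]  # slope 0.5, the assumed regression function

rng = random.Random(0)
n = 50
xs = [i / n for i in range(1, n + 1)]             # design points on a grid
ys = [f_star(x) + rng.gauss(0, 0.1) for x in xs]  # noisy observations

f_hat = erm(function_class, xs, ys)
err_fixed = fixed_design_error(f_hat, f_star, xs)
# Assumed covariate law: uniform on [0, 1).
err_random = random_design_error(f_hat, f_star, lambda r: r.random(), rng)
print(err_fixed, err_random)
```

The fixed-design error only compares the estimate and the regression function at the observed points, while the random-design error averages over fresh draws of the covariate. This toy class is far too simple to exhibit the adaptation phenomena the paper analyzes; the sketch only pins down the quantities being compared.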

Taxonomy

Topics: Statistical Methods and Inference · Probabilistic and Robust Engineering Design · Control Systems and Identification