On the Minimal Error of Empirical Risk Minimization
Gil Kur, Alexander Rakhlin

TL;DR
This paper investigates the fundamental limits of Empirical Risk Minimization in regression, revealing how data design influences the ability to adapt to simpler models and providing sharp bounds for various function classes.
Contribution
It offers new sharp lower bounds on ERM's minimal error in regression, highlighting differences between fixed and random design settings and their implications for model simplicity adaptation.
Findings
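In fixed design, the error of ERM is governed by the global complexity of the entire class.
In random design, ERM can adapt to simpler models only if the local neighborhoods around the regression function are nearly as complex as the class itself.
The paper provides sharp lower bounds on ERM's performance for both Donsker and non-Donsker classes.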
Abstract
We study the minimal error of the Empirical Risk Minimization (ERM) procedure in the task of regression, both in the random and the fixed design settings. Our sharp lower bounds shed light on the possibility (or impossibility) of adapting to simplicity of the model generating the data. In the fixed design setting, we show that the error is governed by the global complexity of the entire class. In contrast, in random design, ERM may only adapt to simpler models if the local neighborhoods around the regression function are nearly as complex as the class itself, a somewhat counter-intuitive conclusion. We provide sharp lower bounds for performance of ERM for both Donsker and non-Donsker classes. We also discuss our results through the lens of recent studies on interpolation in overparameterized models.
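For orientation, the standard least-squares setup that the abstract refers to can be sketched as follows. The notation here (the class $\mathcal{F}$, regression function $f^*$, noise $\xi_i$, and design distribution $P$) is assumed for illustration and is not taken from the paper itself; the paper's exact assumptions and normalizations may differ.

```latex
% Observations: responses are noisy evaluations of an unknown f^* in the class F.
Y_i = f^*(x_i) + \xi_i, \qquad i = 1, \dots, n, \qquad f^* \in \mathcal{F}.

% ERM (least squares): any minimizer of the empirical squared loss over F.
\hat{f}_n \in \operatorname*{arg\,min}_{f \in \mathcal{F}}
  \frac{1}{n} \sum_{i=1}^{n} \bigl( Y_i - f(x_i) \bigr)^2 .

% Fixed design: x_1, ..., x_n are deterministic; error is measured on the sample points.
\frac{1}{n} \sum_{i=1}^{n} \bigl( \hat{f}_n(x_i) - f^*(x_i) \bigr)^2 .

% Random design: X_1, ..., X_n are i.i.d. from P; error is measured in L_2(P).
\int \bigl( \hat{f}_n(x) - f^*(x) \bigr)^2 \, dP(x) .
```

The two error measures differ only in where the estimator is evaluated: at the observed design points (fixed design) or over the whole population (random design). The abstract's contrast between global and local complexity concerns lower bounds on these two quantities.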
Taxonomy
Topics: Statistical Methods and Inference · Probabilistic and Robust Engineering Design · Control Systems and Identification
