Understanding overfitting peaks in generalization error: Analytical risk curves for $l_2$ and $l_1$ penalized interpolation
Partha P Mitra

TL;DR
This paper analyzes overfitting peaks in generalization error (GE) for $l_2$- and $l_1$-penalized interpolation. It introduces a model in which the overfitting peak is dissociated from the fitting model's ability to match the data, and it gives analytical formulas for the GE curves.
Contribution
It introduces the MiSpaR (Misparametrized Sparse Regression) model to separate overfitting peaks from data-fitting capability, and derives analytical formulas for the GE curves under $l_2$ and $l_1$ penalties in the interpolation limit.
Findings
Overfitting peaks can occur independently of data fitting.
Analytical formulas for GE curves reveal differences between $l_2$ and $l_1$ penalties.
Overfitting peaks do not necessarily indicate a transition from classical to modern regimes.
Abstract
Traditionally in regression one minimizes the number of fitting parameters or uses smoothing/regularization to trade off training error (TE) against generalization error (GE). Driving TE to zero by increasing the fitting degrees of freedom (dof) is expected to increase GE. However, modern big-data approaches, including deep nets, seem to over-parametrize and send TE to zero (data interpolation) without impacting GE. Overparametrization has the benefit that global minima of the empirical loss function proliferate and become easier to find. These phenomena have drawn theoretical attention. Regression and classification algorithms have been exhibited that interpolate the data yet generalize optimally. An interesting related phenomenon has been noted: the existence of non-monotonic risk curves, with a peak in GE with increasing dof. It was suggested that this peak separates a classical regime from a modern…
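The peak in GE with increasing dof can be reproduced numerically in a few lines. The sketch below is not the paper's MiSpaR model. It is a generic illustration under stated assumptions: isotropic Gaussian features, a dense true parameter vector of unit norm, and a fit that uses only the first p features. The names `min_norm_risk`, `n`, `D`, `sigma` and `p_grid` are hypothetical and chosen for this example. For p < n the pseudoinverse gives ordinary least squares; for p >= n it gives the minimum $l_2$-norm interpolator, so TE is zero there.

```python
import numpy as np

def min_norm_risk(n=20, D=60, sigma=0.5,
                  p_grid=(5, 10, 15, 20, 30, 45, 60),
                  trials=50, seed=0):
    """Average GE of a pseudoinverse fit using the first p of D features.

    Assumption: test inputs are isotropic Gaussian, so the expected squared
    test error equals ||beta_hat - beta||^2 + sigma^2 and needs no test set.
    """
    rng = np.random.default_rng(seed)
    beta = np.ones(D) / np.sqrt(D)          # dense true parameters, unit norm
    risk = {p: 0.0 for p in p_grid}
    for _ in range(trials):
        X = rng.standard_normal((n, D))     # n training samples, D features
        y = X @ beta + sigma * rng.standard_normal(n)
        for p in p_grid:
            beta_hat = np.zeros(D)          # unused features get weight zero
            # pinv: least squares if p < n, minimum l2-norm interpolator if p >= n
            beta_hat[:p] = np.linalg.pinv(X[:, :p]) @ y
            risk[p] += (np.sum((beta_hat - beta) ** 2) + sigma ** 2) / trials
    return risk

risk = min_norm_risk()
for p, r in risk.items():
    print(f"p = {p:2d}  GE = {r:.3f}")
```

With n = 20 training samples, the averaged GE in this setup should be largest near p = n, where the design matrix is square and nearly singular, and smaller on either side of it. Replacing the pseudoinverse with a minimum $l_1$-norm interpolator would require a linear-programming solver and is left out of this sketch.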
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Mathematical functions and polynomials · Numerical methods in inverse problems
