Simplicity bias and optimization threshold in two-layer ReLU networks
Etienne Boursier, Nicolas Flammarion

TL;DR
This paper gives a theoretical account of why two-layer ReLU networks converge to simpler solutions instead of interpolating the training data once the number of training samples exceeds an optimization threshold, and of how this simplicity bias improves generalization on complex tasks.
Contribution
It introduces a theoretical framework explaining the simplicity bias and the optimization threshold in overparametrized two-layer ReLU networks, highlighting their impact on generalization.
Findings
Networks often converge to simpler solutions rather than interpolating data.
An early alignment phase causes neurons to align with specific directions.
Simplicity bias enhances generalization beyond the interpolation regime.
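To make the object of study concrete, here is a minimal NumPy sketch of a two-layer ReLU network trained by full-batch gradient descent on a squared loss from a small initialization. It is an illustration under stated assumptions, not the authors' experimental setup: the function names (forward, train_loss, gd_step, run), the synthetic linear data, and every hyperparameter (n, d, m, steps, lr, init_scale) are hypothetical choices made for this example.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def forward(W, a, X):
    # Two-layer ReLU network: f(x) = sum_j a_j * relu(<w_j, x>).
    # W has shape (m, d), a has shape (m,), X has shape (n, d).
    return relu(X @ W.T) @ a

def train_loss(W, a, X, y):
    # Squared training loss, averaged over the n samples.
    r = forward(W, a, X) - y
    return 0.5 * np.mean(r ** 2)

def gd_step(W, a, X, y, lr):
    # One full-batch gradient descent step on train_loss.
    n = X.shape[0]
    pre = X @ W.T                      # pre-activations, shape (n, m)
    act = relu(pre)
    r = act @ a - y                    # residuals, shape (n,)
    grad_a = act.T @ r / n
    # ReLU derivative taken as 1 where pre > 0 and 0 elsewhere (a convention).
    grad_W = (r[:, None] * (pre > 0) * a[None, :]).T @ X / n
    return W - lr * grad_W, a - lr * grad_a

def run(n=20, d=2, m=50, steps=2000, lr=0.05, init_scale=1e-3, seed=0):
    # Hypothetical synthetic task: noisy linear labels (an assumption,
    # chosen only so that the example is self-contained).
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, d))
    y = X @ np.ones(d) + 0.1 * rng.standard_normal(n)
    # Small initialization: all weights start close to zero.
    W = init_scale * rng.standard_normal((m, d))
    a = init_scale * rng.standard_normal(m)
    losses = [train_loss(W, a, X, y)]
    for _ in range(steps):
        W, a = gd_step(W, a, X, y, lr)
        losses.append(train_loss(W, a, X, y))
    return W, a, losses

if __name__ == "__main__":
    W, a, losses = run()
    print(f"initial loss {losses[0]:.4f}, final loss {losses[-1]:.4f}")
```

Rerunning the sketch with a larger n while keeping m fixed, and checking whether the final training loss still approaches zero, is one informal way to probe the interpolation versus non-interpolation behavior the findings describe. The sketch makes no claim about where the threshold lies.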
Abstract
Understanding generalization of overparametrized neural networks remains a fundamental challenge in machine learning. Most of the literature studies generalization from an interpolation point of view, taking convergence of parameters towards a global minimum of the training loss for granted. While overparametrized architectures indeed interpolate the data for typical classification tasks, this interpolation paradigm no longer seems valid for more complex tasks such as in-context learning or diffusion. Instead, for such tasks, it has been empirically observed that the trained models go from global minima to spurious local minima of the training loss as the number of training samples becomes larger than some level we call the optimization threshold. While the former yields poor generalization to the true population loss, the latter was observed to actually correspond to the…
