TL;DR
This paper draws an analogy between the loss landscape of deep neural networks and the energy landscape of repulsive ellipses, revealing a phase transition akin to jamming that explains the behavior of minima and the network's ability to fit data.
Contribution
The paper introduces a phase-transition framework for understanding neural network loss landscapes, links that transition to jamming, and uses it to explain the effects of overparameterization and depth.
Findings
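Near the phase transition, the curvature of the loss at its minima shows critical behavior.
In the overparametrized regime the network does not get stuck in poor minima, and the ability of fully connected networks to fit random data is reported to be independent of depth.
Near the transition, the learning dynamics appear prone to avalanche-like behavior.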
Abstract
Deep learning has been immensely successful at a variety of tasks, ranging from classification to artificial intelligence. Learning corresponds to fitting training data, which is implemented by descending a very high-dimensional loss function. Understanding under which conditions neural networks do not get stuck in poor minima of the loss, and how the landscape of that loss evolves as depth is increased, remains a challenge. Here we predict, and test empirically, an analogy between this landscape and the energy landscape of repulsive ellipses. We argue that in fully connected (FC) networks a phase transition delimits the over- and under-parametrized regimes where fitting can or cannot be achieved. In the vicinity of this transition, properties of the curvature of the minima of the loss are critical. This transition shares direct similarities with the jamming transition by which particles form a disordered solid as the density…
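To make the analogy with repulsive particles concrete, the sketch below computes a quadratic hinge loss and counts the training points that are not yet fitted. This summary does not state which loss is used, so the quadratic hinge form, the margin of 0.5, and the function name `quadratic_hinge` are all assumptions made for illustration. Under that assumption, a fitted point costs nothing, like two particles that do not touch, while an unfitted point costs a quadratic penalty, like an overlap.

```python
import numpy as np


def quadratic_hinge(outputs, labels, margin=0.5):
    """Return (loss, n_unsatisfied) for a quadratic hinge loss.

    Hypothetical helper: the loss form and the margin value are assumptions,
    not taken from the summary above.

    Each training point has a gap delta = margin - label * output.
    delta <= 0: the point is fitted and contributes zero loss.
    delta > 0: the point is unfitted and contributes 0.5 * delta**2.
    """
    outputs = np.asarray(outputs, dtype=float)
    labels = np.asarray(labels, dtype=float)

    delta = margin - labels * outputs      # per-point gap
    overlaps = np.maximum(delta, 0.0)      # only unfitted points cost energy
    loss = 0.5 * np.mean(overlaps ** 2)    # averaged over all training points
    n_unsatisfied = int(np.count_nonzero(delta > 0))
    return loss, n_unsatisfied


# Two points are fitted with margin, two are not.
loss, n_unsatisfied = quadratic_hinge([1.0, -1.0, 0.0, -2.0], [1, -1, 1, 1])
print(loss, n_unsatisfied)
```

In this picture, zero loss means every constraint is satisfied, which corresponds to the regime where fitting can be achieved; a positive loss with a nonzero count of unsatisfied points corresponds to the regime where it cannot.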
