The jamming transition as a paradigm to understand the loss landscape of deep neural networks

Mario Geiger; Stefano Spigler; Stéphane d'Ascoli; Levent Sagun; Marco Baity-Jesi; Giulio Biroli; Matthieu Wyart

arXiv:1809.09349·cond-mat.dis-nn·July 17, 2019

TL;DR

This paper draws an analogy between the loss landscape of deep neural networks and the energy landscape of repulsive ellipses, revealing a phase transition akin to jamming that explains the behavior of minima and the network's ability to fit data.

Contribution

The paper introduces a phase-transition framework for understanding neural network loss landscapes, links it to the jamming transition, and offers insight into the effects of overparameterization and depth.

Findings

01

Near the phase transition, properties of the curvature of the minima of the loss are critical.

02

In the over-parametrized regime, poor minima of the loss are not encountered; the ability of fully-connected networks to fit random data is independent of depth.

03

Learning dynamics exhibit avalanche-like behavior near the transition.

Abstract

Deep learning has been immensely successful at a variety of tasks, ranging from classification to artificial intelligence. Learning corresponds to fitting training data, which is implemented by descending a very high-dimensional loss function. Understanding under which conditions neural networks do not get stuck in poor minima of the loss, and how the landscape of that loss evolves as depth is increased, remains a challenge. Here we predict, and test empirically, an analogy between this landscape and the energy landscape of repulsive ellipses. We argue that in fully-connected networks a phase transition delimits the over- and under-parametrized regimes where fitting can or cannot be achieved. In the vicinity of this transition, properties of the curvature of the minima of the loss are critical. This transition shares direct similarities with the jamming transition by which particles form a disordered solid as the density…
