Phase transitions reveal hierarchical structure in deep neural networks
Ibrahim Talha Ersoy, Andrés Fernando Cardozo Licha, Karoline Wiesner

TL;DR
This paper uncovers how phase transitions, saddle points, and mode connectivity in deep neural networks are interconnected through the geometry of their loss landscapes, revealing a hierarchical structure akin to phases in physics.
Contribution
It provides a unified geometric framework explaining key phenomena in DNN training, introduces a new algorithm to probe error landscape geometry, and demonstrates the hierarchical structure of accuracy basins.
Findings
Phase transitions are governed by saddle points in the loss landscape.
A new algorithm effectively finds paths connecting global minima.
Saddle points induce transitions between models with different digit class encodings.
Abstract
Training deep neural networks (DNNs) relies on the model converging on a high-dimensional, non-convex loss landscape toward a good minimum. Yet much of the phenomenology of training remains poorly understood. We focus on three seemingly disparate observations: the occurrence of phase transitions reminiscent of statistical physics, the ubiquity of saddle points, and the phenomenon of mode connectivity relevant for model merging. We unify these within a single explanatory framework, the geometry of the loss and error landscapes. We analytically show that phase transitions in DNN learning are governed by saddle points in the loss landscape. Building on this insight, we introduce a simple, fast, and easy-to-implement algorithm that uses the L2 regularizer as a tool to probe the geometry of error landscapes. We apply it to confirm mode connectivity in DNNs trained on the MNIST dataset by efficiently…
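To make the link between L2 regularization, saddle points, and phase transitions concrete, here is a minimal toy sketch. It is not the paper's algorithm, whose details are not given in this excerpt. It assumes a hypothetical two-parameter "network" whose output is the product w1*w2, fitted to the target 1, with loss (w1*w2 - 1)^2 + beta*(w1^2 + w2^2). The origin is a saddle point of the unregularized loss, and the function names, initial point, learning rate, and step count are all illustrative choices.

```python
import numpy as np

def grad(w, beta):
    """Gradient of the toy loss (w1*w2 - 1)^2 + beta*(w1^2 + w2^2)."""
    w1, w2 = w
    r = w1 * w2 - 1.0  # residual of the toy model's output against the target 1
    return np.array([2 * r * w2 + 2 * beta * w1,
                     2 * r * w1 + 2 * beta * w2])

def minimize(beta, w0=(0.5, 0.3), lr=0.05, steps=5000):
    """Plain gradient descent from a fixed (hypothetical) starting point."""
    w = np.array(w0, dtype=float)
    for _ in range(steps):
        w = w - lr * grad(w, beta)
    return w

def learned_product(beta):
    """Model output w1*w2 at the point gradient descent converges to."""
    w1, w2 = minimize(beta)
    return w1 * w2

if __name__ == "__main__":
    # Sweep the regularization strength across the transition at beta = 1.
    for beta in (0.0, 0.5, 0.9, 1.1, 1.5):
        print(f"beta={beta:.1f}  w1*w2={learned_product(beta):.4f}")
```

For this toy loss the stationary points can be worked out by hand: for beta < 1 the minima sit at w1 = w2 = ±sqrt(1 - beta), so the learned product is 1 - beta, while for beta > 1 the Hessian at the origin has eigenvalues 2*beta ± 2, both positive, and the former saddle becomes the only minimum. The learned output therefore goes continuously to zero at beta = 1 and stays there, a small-scale analogue of a regularizer-driven transition controlled by a saddle point.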
Taxonomy
Topics: Machine Learning in Materials Science · Quantum many-body systems · Stochastic Gradient Optimization Techniques
