Learning through atypical "phase transitions" in overparameterized neural networks
Carlo Baldassi, Clarissa Lauditi, Enrico M. Malatesta, Rosalba Pacelli, Gabriele Perugini, Riccardo Zecchina

TL;DR
This paper uses statistical physics methods to analyze phase transitions in overparameterized neural networks. These transitions give rise to atypical, solution-dense regions with good generalization, which helps explain why such networks learn effectively.
Contribution
The paper introduces an analytical framework for studying phase transitions in overparameterized neural networks and highlights the role of atypical solutions in effective learning.
Findings
A second phase transition leads to solution-dense regions with good generalization.
Efficient algorithms tend to sample rare, atypical solutions.
Numerical tests support the theoretical scenario.
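To make "solution-dense" concrete, here is a minimal Python sketch. It is not the paper's model or method: it assumes a single-layer binary perceptron as a simplified stand-in for the non-convex networks studied, and all function names (`make_teacher_data`, `training_errors`, `neighbor_solution_fraction`) are hypothetical. It labels random patterns with a hidden "teacher" and then estimates what fraction of weight configurations at a fixed Hamming distance from a known solution also fit the training data perfectly.

```python
import numpy as np

def make_teacher_data(n_inputs, n_patterns, rng):
    """Random +/-1 patterns labeled by a hidden binary perceptron (the teacher).

    Assumption: n_inputs is odd, so the pre-activation is never zero.
    """
    teacher = rng.choice([-1, 1], size=n_inputs)
    patterns = rng.choice([-1, 1], size=(n_patterns, n_inputs))
    labels = np.sign(patterns @ teacher)
    return teacher, patterns, labels

def training_errors(weights, patterns, labels):
    """Number of training patterns misclassified by the given binary weights."""
    return int(np.sum(np.sign(patterns @ weights) != labels))

def neighbor_solution_fraction(weights, patterns, labels, n_flips, n_samples, rng):
    """Estimate the fraction of configurations at Hamming distance n_flips
    from `weights` that also have zero training errors."""
    hits = 0
    for _ in range(n_samples):
        # Flip n_flips distinct weights chosen uniformly at random.
        idx = rng.choice(weights.size, size=n_flips, replace=False)
        neighbor = weights.copy()
        neighbor[idx] *= -1
        hits += training_errors(neighbor, patterns, labels) == 0
    return hits / n_samples

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    teacher, patterns, labels = make_teacher_data(n_inputs=101, n_patterns=30, rng=rng)
    # The teacher fits its own labels, so it serves as a reference solution.
    for n_flips in (1, 5, 10):
        frac = neighbor_solution_fraction(teacher, patterns, labels, n_flips, 500, rng)
        print(n_flips, frac)
```

A solution whose neighbors mostly remain solutions sits in a dense region in the sense used above; an isolated solution would show this fraction dropping to near zero after only a few flips. The sketch only probes the neighborhood of one reference solution and says nothing about where the transitions occur.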
Abstract
Current deep neural networks are highly overparameterized (up to billions of connection weights) and nonlinear. Yet they can fit data almost perfectly through variants of gradient descent algorithms and achieve unexpected levels of prediction accuracy without overfitting. These are formidable results that defy predictions of statistical learning and pose conceptual challenges for non-convex optimization. In this paper, we use methods from statistical physics of disordered systems to analytically study the computational fallout of overparameterization in non-convex binary neural network models, trained on data generated from a structurally simpler but "hidden" network. As the number of connection weights increases, we follow the changes of the geometrical structure of different minima of the error loss function and relate them to learning and generalization performance. A first…
Taxonomy
Topics: Machine Learning in Materials Science · Neural Networks and Applications · Statistical Mechanics and Entropy
