Statistical mechanics of extensive-width Bayesian neural networks near interpolation
Jean Barbier, Francesco Camilli, Minh-Toan Nguyen, Mauro Pastore, Rudy Skerk

TL;DR
This paper uses statistical physics to analyze supervised learning in a two-layer fully connected Bayesian neural network whose hidden layer is large but proportional to the input dimension, revealing phase transitions and the conditions under which feature learning and specialization occur.
Contribution
It provides a detailed theoretical analysis of Bayesian neural networks near interpolation, bridging the gap between simple models and practical networks, and uncovers new phenomena in feature learning.
Findings
The more strongly a feature contributes to the responses, the less data is needed to learn it.
When data is scarce, the model learns only non-linear combinations of the teacher weights.
Specialization occurs with sufficient data but may be computationally hard.
Abstract
For three decades, statistical mechanics has provided a framework to analyse neural networks. However, the theoretically tractable models, e.g., perceptrons, random features models and kernel machines, or multi-index models and committee machines with few neurons, have remained simple compared to those used in applications. In this paper we help reduce the gap between practical networks and their theoretical understanding through a statistical physics analysis of the supervised learning of a two-layer fully connected network with generic weight distribution and activation function, whose hidden layer is large but remains proportional to the input dimension. This makes it more realistic than infinitely wide networks, where no feature learning occurs, but also more expressive than narrow networks or networks with fixed inner weights. We focus on the Bayes-optimal learning in the teacher-student…
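To make the setting concrete, here is a minimal sketch of how data could be generated in a teacher-student scenario for a two-layer fully connected network whose width is proportional to the input dimension. The abstract above is truncated, so every specific below is an assumption made for illustration rather than the authors' exact model: Gaussian inputs and weights, a `tanh` activation, additive Gaussian label noise, the `1/sqrt(d)` and `1/sqrt(k)` normalisations, and a sample size growing like `d**2` (comparable to the number of inner weights). The function name and its parameters (`gamma`, `alpha`, `noise_std`) are hypothetical.

```python
import numpy as np


def make_teacher_student_data(d=50, gamma=0.5, alpha=1.0,
                              activation=np.tanh, noise_std=0.1, seed=0):
    """Sample a random two-layer teacher and a dataset labelled by it.

    All distributional and scaling choices here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)

    k = int(gamma * d)        # hidden width proportional to the input dimension
    n = int(alpha * d * d)    # assumption: number of samples scales like d^2

    W = rng.standard_normal((k, d))   # teacher inner weights (assumed Gaussian)
    v = rng.standard_normal(k)        # teacher readout weights (assumed Gaussian)
    X = rng.standard_normal((n, d))   # inputs (assumed standard Gaussian)

    # Pre-activations are normalised so each entry stays of order one as d grows.
    hidden = activation(X @ W.T / np.sqrt(d))

    # Noisy responses from the teacher network.
    y = hidden @ v / np.sqrt(k) + noise_std * rng.standard_normal(n)
    return X, y, W, v


if __name__ == "__main__":
    X, y, W, v = make_teacher_student_data(d=20, gamma=0.5, alpha=1.0)
    print(X.shape, y.shape, W.shape, v.shape)
```

A student with the same architecture would then be fitted to `(X, y)` and compared against the teacher's `W` and `v`. In the Bayes-optimal setting the student is assumed to know the weight distribution, the activation and the noise level used by the teacher.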
Taxonomy
Topics: Neural Networks and Applications
Methods: Focus
