Spherical Boltzmann machines: a solvable theory of learning and generation in energy-based models
Thomas Tulinski, Simona Cocco, Rémi Monasson, Jorge Fernandez-De-Cossio-Diaz

TL;DR
This paper provides an exact analytical study of the spherical Boltzmann machine, revealing phase transitions and training dynamics in energy-based models using tools from physics and random matrix theory.
Contribution
It introduces a solvable high-dimensional model of energy-based learning, connecting phase transitions to generative phenomena and training biases.
Findings
Exact equations for the training dynamics of the spherical Boltzmann machine (SBM)
Identification of phase transitions during training
Connection of transitions to sampling and regularization effects
Abstract
Energy-based models (EBMs) are flexible generative architectures inspired by statistical physics, but their learning and generative properties remain poorly understood. Here, we analyze a solvable EBM in the high-dimensional limit: the spherical Boltzmann machine (SBM). Combining tools from random matrix theory and dynamical mean-field theory, we: solve exact equations describing the training dynamics of the SBM; compute the Bayesian evidence, which acts as a partition function in parameter space and encodes global properties of the trained model; and uncover cascades of phase transitions that occur both during training and as a function of hyperparameters, related to successive alignment and condensation of the top modes of the coupling matrix to the data. We connect these transitions to sampling-time generative phenomena in a teacher-student scenario, including: sampling temperature…
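To make the idea of a spherical model with a coupling matrix concrete, here is a minimal sketch, not taken from the paper. It assumes an energy of the form E(x) = -(1/2) xᵀJx with the constraint |x|² = N at unit temperature, and it uses the standard large-N saddle-point treatment of spherical models, in which a Lagrange multiplier mu enforces the constraint. The function names `solve_multiplier` and `model_second_moments` are hypothetical, and the paper's conventions (normalization, temperature, additional terms) may differ.

```python
import numpy as np

def solve_multiplier(eigvals, tol=1e-12, max_iter=200):
    """Find mu > max(eigvals) with mean(1 / (mu - eigvals)) == 1 by bisection.

    Assumption: unit temperature and constraint |x|^2 = N (not from the source).
    """
    n = eigvals.size
    lam_max = eigvals.max()
    # At lam_max + 1/n the mean is >= 1; at lam_max + 1 it is <= 1,
    # so the root is bracketed and the mean is monotonically decreasing in mu.
    lo, hi = lam_max + 1.0 / n, lam_max + 1.0
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if np.mean(1.0 / (mid - eigvals)) > 1.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

def model_second_moments(J):
    """Large-N estimate of <x x^T> under the assumed spherical model."""
    eigvals, eigvecs = np.linalg.eigh(J)
    mu = solve_multiplier(eigvals)
    # Each eigenmode k carries variance 1 / (mu - eigvals[k]).
    C = (eigvecs / (mu - eigvals)) @ eigvecs.T
    return mu, C

# Usage: a random symmetric coupling matrix (illustrative data only).
rng = np.random.default_rng(0)
N = 50
A = rng.standard_normal((N, N))
J = (A + A.T) / np.sqrt(2 * N)
mu, C = model_second_moments(J)
```

In this sketch the trace of `C` equals N by construction, and a mode condenses when mu sits within order 1/N of the top eigenvalue, so that this single mode carries a finite fraction of the total variance. In a maximum-likelihood fit of such a pairwise model, the gradient with respect to J is proportional to the difference between the data second moments and `C`.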
