Can Stationary Distributions of Scale-Invariant Neural Networks Be Described by the Thermodynamics of an Ideal Gas?
Ildus Sadrtdinov, Ekaterina Lobacheva, Ivan Klimov, Mikhail Burtsev, Mikhail I. Katsnelson, Dmitry Vetrov

TL;DR
This paper introduces a thermodynamic framework for analyzing the stationary distributions of scale-invariant neural networks trained with SGD and weight decay, and finds an ideal gas analogy that is consistent with empirical observations.
Contribution
The paper develops a thermodynamic perspective on neural network training dynamics, relating training hyperparameters (e.g., learning rate, weight decay) to thermodynamic variables such as temperature, pressure, and volume, and validating the analogy through theory and experiments.
Findings
Under a simplified isotropic noise model, SGD dynamics closely correspond to ideal gas behavior
Stationary entropy predictions match experimental data
Hyperparameters can be interpreted as thermodynamic variables
Abstract
Understanding the training dynamics of deep neural networks remains a major open problem, with physics-inspired approaches offering promising insights. Building on this perspective, we develop a thermodynamic framework to describe the stationary distributions of stochastic gradient descent (SGD) with weight decay for scale-invariant neural networks, a setting that both reflects practical architectures with normalization layers and permits theoretical analysis. We establish analogies between training hyperparameters (e.g., learning rate, weight decay) and thermodynamic variables such as temperature, pressure, and volume. Starting with a simplified isotropic noise model, we uncover a close correspondence between SGD dynamics and ideal gas behavior, validated through theory and simulation. Extending to training of neural networks, we show that key predictions of the framework, including…
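The setting described in the abstract (SGD with weight decay on a scale-invariant objective, driven by isotropic noise) can be illustrated with a small toy simulation. The sketch below is not the authors' model or code; it assumes the gradient is pure Gaussian noise projected onto the tangent space of the weight vector and scaled by 1/||w||, which is the generic gradient structure of any scale-invariant function. Under that assumption, the squared-norm update r² → (1 − ηλ)² r² + η² G² / r² has a fixed point at r⁴ = η G² / (λ (2 − ηλ)), where G² is the expected squared gradient norm on the unit sphere. The function names, default parameters, and the noise scale `sigma` are hypothetical and chosen only for illustration.

```python
import math
import random


def simulate_norm(lr, wd, dim=50, sigma=1.0, steps=4000, seed=0):
    """Toy SGD with weight decay on a scale-invariant objective.

    Assumption (not from the paper): the gradient is isotropic Gaussian
    noise, projected to be orthogonal to w and scaled by 1/||w||.
    Returns the weight norm averaged over the second half of the run.
    """
    rng = random.Random(seed)
    w = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norms = []
    for step in range(steps):
        r2 = sum(x * x for x in w)
        r = math.sqrt(r2)
        noise = [rng.gauss(0.0, sigma) for _ in range(dim)]
        # Remove the radial component so the gradient is orthogonal to w.
        coef = sum(n * x for n, x in zip(noise, w)) / r2
        grad = [(n - coef * x) / r for n, x in zip(noise, w)]
        # SGD step with weight decay: w <- (1 - lr * wd) * w - lr * grad.
        w = [(1.0 - lr * wd) * x - lr * g for x, g in zip(w, grad)]
        if step >= steps // 2:
            norms.append(math.sqrt(sum(x * x for x in w)))
    return sum(norms) / len(norms)


def predicted_norm(lr, wd, dim=50, sigma=1.0):
    """Fixed point of the squared-norm recursion under the toy noise model."""
    # Expected squared norm of the tangential noise on the unit sphere.
    g2 = sigma ** 2 * (dim - 1)
    return (lr * g2 / (wd * (2.0 - lr * wd))) ** 0.25


if __name__ == "__main__":
    lr, wd = 0.1, 0.1
    print("simulated norm:", simulate_norm(lr, wd))
    print("predicted norm:", predicted_norm(lr, wd))
```

In this toy setup the stationary weight norm is set jointly by the learning rate, the weight decay, and the noise level, which is the kind of hyperparameter dependence the thermodynamic analogy is meant to organize. How these quantities map onto temperature, pressure, and volume is specified in the paper itself and is not reproduced here.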
