Dynamics of neural scaling laws in random feature regression with powerlaw-distributed kernel eigenvalues
Jakob Kramp, Javed Lindner, Moritz Helias

TL;DR
This paper develops a dynamical mean-field theory for the generalization behavior of neural networks in high dimensions. The theory unifies several learning regimes (Bayesian inference, gradient flow, and Langevin dynamics) and explains neural scaling laws through the spectral properties of the data.
Contribution
It introduces a unified theoretical framework that captures the dynamics of neural network learning across different regimes using statistical physics tools.
Findings
Explains neural scaling laws through spectral properties of data.
Unifies Bayesian inference, gradient flow, and Langevin dynamics in a single model.
Quantitatively links the spectrum of the data to the dynamics of the generalization error.
Abstract
Training large neural networks exposes neural scaling laws for the generalization error, which points to universal behavior of learning in high dimensions across network architectures. It was also shown that this effect persists in the limit of highly overparametrized networks as well as in the Neural Network Gaussian Process limit. Here we develop a principled understanding of the typical behavior of generalization in Neural Network Gaussian Process regression dynamics. We derive a dynamical mean-field theory that captures the typical-case learning dynamics. This allows us to unify multiple regimes of learning studied in the current literature, namely Bayesian inference on Gaussian processes, gradient flow with or without weight decay, and stochastic Langevin training dynamics. Employing tools from statistical physics, the unified framework we derive in either of these cases…
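As a rough numerical illustration of how two of the regimes named in the abstract connect, the sketch below runs gradient flow with weight decay on a linear model whose Gaussian features have power-law eigenvalues, and compares the late-time weights with the Gaussian process posterior mean. It is not the paper's dynamical mean-field theory, only a finite-size toy under stated assumptions: the noiseless linear teacher, the sample and feature counts, the exponent `alpha = 1.5`, the weight decay `kappa = 0.1`, and all function names are hypothetical choices made for this example.

```python
import numpy as np


def powerlaw_features(n_samples, n_features, alpha, rng):
    """Gaussian features whose covariance eigenvalues decay as k^(-alpha)."""
    eigenvalues = np.arange(1, n_features + 1, dtype=float) ** (-alpha)
    z = rng.standard_normal((n_samples, n_features))
    return z * np.sqrt(eigenvalues), eigenvalues


def gradient_flow_weights(phi, y, kappa, t):
    """Weights at time t for dw/dt = -(phi^T (phi w - y) + kappa w), w(0) = 0."""
    a = phi.T @ phi + kappa * np.eye(phi.shape[1])
    evals, evecs = np.linalg.eigh(a)
    evals = np.clip(evals, 0.0, None)  # remove tiny negative round-off
    safe = np.where(evals > 1e-12, evals, 1.0)
    # Spectral filter (1 - exp(-e t)) / e, with the e -> 0 limit equal to t.
    filt = np.where(evals > 1e-12, -np.expm1(-evals * t) / safe, t)
    return evecs @ (filt * (evecs.T @ (phi.T @ y)))


def gp_posterior_mean_weights(phi, y, kappa):
    """Feature-space weights of the GP posterior mean.

    Assumes the kernel phi phi^T and observation noise variance kappa.
    """
    k = phi @ phi.T
    return phi.T @ np.linalg.solve(k + kappa * np.eye(k.shape[0]), y)


def generalization_error(w, w_teacher, eigenvalues):
    """Population squared error on fresh inputs: sum_k lambda_k (w_k - w*_k)^2."""
    return float(np.sum(eigenvalues * (w - w_teacher) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    phi, eigenvalues = powerlaw_features(50, 200, alpha=1.5, rng=rng)
    w_teacher = rng.standard_normal(200)  # hypothetical noiseless teacher
    y = phi @ w_teacher
    kappa = 0.1

    for t in (0.0, 1.0, 10.0, 100.0, 1e4):
        w_t = gradient_flow_weights(phi, y, kappa, t)
        print(t, generalization_error(w_t, w_teacher, eigenvalues))

    w_gp = gp_posterior_mean_weights(phi, y, kappa)
    print("GP posterior mean:", generalization_error(w_gp, w_teacher, eigenvalues))
```

In this toy setting the weight decay strength plays the role of the observation noise variance, so the long-time limit of the gradient flow coincides with the posterior mean. Repeating the run over a range of sample counts and plotting the late-time error would be one way to look for power-law scaling, although finite-size effects can blur the exponent.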
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Stochastic Gradient Optimization Techniques · Machine Learning in Materials Science
