Renormalizable Spectral-Shell Dynamics as the Origin of Neural Scaling Laws
Yizhou Zhang

TL;DR
This paper derives a macroscopic spectral-shell framework from gradient descent in neural networks, explaining neural scaling laws and double descent phenomena through a unified, self-similar dynamical model.
Contribution
It introduces a renormalizable spectral-shell dynamics approach that unifies lazy training and feature learning, providing a theoretical foundation for neural scaling laws.
Findings
Explains neural scaling laws and double descent phenomena.
Unifies lazy training and feature learning.
Provides explicit scaling exponents and self-similar solutions.
Abstract
Neural scaling laws and double-descent phenomena suggest that deep-network training obeys a simple macroscopic structure despite highly nonlinear optimization dynamics. We derive such structure directly from gradient descent in function space. For mean-squared error loss, the training error evolves as $\dot e_t = -M(t)\,e_t$ with $M(t) = J_{\theta(t)} J_{\theta(t)}^{*}$, a time-dependent self-adjoint operator induced by the network Jacobian. Using Kato perturbation theory, we obtain an exact system of coupled modewise ODEs in the instantaneous eigenbasis of $M(t)$. To extract macroscopic behavior, we introduce a logarithmic spectral-shell coarse-graining and track quadratic error energy across shells. Microscopic interactions within each shell cancel identically at the energy level, so shell energies evolve only through dissipation and external inter-shell interactions. We formalize this…
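As a rough illustration of the shell bookkeeping described in the abstract, the sketch below simulates the error flow under a strong simplifying assumption: the operator M is frozen in time (the lazy-training limit), so the modes decouple and each shell loses energy through dissipation only. The random Jacobian, the shell base of 2, and the names `shell_energies` and `simulate` are hypothetical choices made for this example, not the paper's construction; the time-dependent case with inter-shell coupling is not modeled here.

```python
import numpy as np

def shell_energies(eigvals, coeffs, base=2.0):
    """Sum squared error coefficients over logarithmic eigenvalue shells."""
    # Shell index k satisfies base**k <= lambda < base**(k+1).
    idx = np.floor(np.log(eigvals) / np.log(base)).astype(int)
    idx -= idx.min()  # shift so the lowest shell has index 0
    return np.bincount(idx, weights=coeffs**2)

def simulate(n=256, steps=200, dt=0.05, seed=0):
    """Evolve e' = -M e for a frozen M and record shell energies over time."""
    rng = np.random.default_rng(seed)
    # Assumption: a random matrix stands in for the network Jacobian.
    J = rng.standard_normal((n, 2 * n)) / np.sqrt(2 * n)
    M = J @ J.T  # self-adjoint, positive definite almost surely
    lam, U = np.linalg.eigh(M)
    lam = np.clip(lam, 1e-12, None)  # guard the logarithm against round-off
    e0 = rng.standard_normal(n)  # initial training error
    c0 = U.T @ e0  # error coefficients in the eigenbasis of M
    history = []
    for s in range(steps + 1):
        # With M frozen, each mode decays independently as exp(-lambda * t).
        c_t = c0 * np.exp(-lam * s * dt)
        history.append(shell_energies(lam, c_t))
    return lam, c0, np.array(history)

if __name__ == "__main__":
    lam, c0, hist = simulate()
    print("number of shells:", hist.shape[1])
    print("initial shell energies:", hist[0])
    print("final shell energies:", hist[-1])
```

Each row of the returned history holds the energy per shell at one time step, ordered from the smallest-eigenvalue shell to the largest. In this frozen setting the high-eigenvalue shells drain fastest, which is the dissipation term in isolation.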
Taxonomy
Topics: Neural Networks and Reservoir Computing · Quantum many-body systems · Stochastic Gradient Optimization Techniques
