Spectral Dynamics in Deep Networks: Feature Learning, Outlier Escape, and Learning Rate Transfer

Clarissa Lauditi; Cengiz Pehlevan; Blake Bordelon

arXiv:2605.07870·cond-mat.dis-nn·May 22, 2026

TL;DR

This paper develops a spectral dynamical mean-field theory (DMFT) that tracks how hidden-weight spectra evolve during training in wide neural networks, with implications for feature learning, outlier dynamics, and hyperparameter transfer.

Contribution

It introduces a two-level DMFT framework that jointly tracks bulk and outlier spectral dynamics in wide neural networks, covering spiked ensembles whose spike directions remain statistically dependent on the random bulk.

Findings

01

Outlier dynamics depend on training time, network width, output scale, and initialization variance.

02

In deep linear networks, $\mu$P yields width-stable growth of the leading NTK mode toward the edge of stability (EoS).

03

Large output tasks involve spectral restructuring beyond simple bulk+outlier models.

Abstract

We study the evolution of hidden-weight spectra in wide neural networks trained by (stochastic) gradient descent. We develop a two-level dynamical mean-field theory (DMFT) that jointly tracks bulk and outlier spectral dynamics for spiked ensembles whose spike directions remain statistically dependent on the random bulk. We apply this framework to two settings: (1) infinite-width nonlinear networks in mean-field/$\mu$P scaling and (2) deep linear networks in the proportional high-dimensional limit, where width, input dimension, and sample size diverge with fixed ratios. Our theory predicts how outliers evolve with training time, width, output scale, and initialization variance. In deep linear networks, $\mu$P yields width-consistent outlier dynamics and hyperparameter transfer, including width-stable growth of the leading NTK mode toward the edge of stability (EoS). In contrast, NTK…
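
For readers unfamiliar with the bulk-plus-outlier picture the abstract refers to, the following is a minimal toy sketch and not the paper's method. It assumes the simplest textbook case: a static symmetric Gaussian (Wigner) matrix plus a rank-one spike whose direction is independent of the bulk, which is exactly the independence the paper's framework relaxes. The function name `spiked_wigner_top_eigs` and the parameters `n`, `theta`, and `seed` are hypothetical choices made for illustration. In this classical setting the bulk edge sits near 2, and a spike of strength `theta > 1` produces an outlier near `theta + 1/theta`.

```python
import numpy as np


def spiked_wigner_top_eigs(n=1000, theta=2.0, seed=0):
    """Return the two largest eigenvalues of a rank-one spiked Wigner matrix.

    Toy model only: the spike direction is drawn independently of the bulk,
    unlike the statistically dependent spikes studied in the paper.
    """
    rng = np.random.default_rng(seed)

    # Symmetric Gaussian bulk with off-diagonal variance 1/n,
    # so the spectrum approaches the semicircle law on [-2, 2].
    g = rng.standard_normal((n, n))
    bulk = (g + g.T) / np.sqrt(2 * n)

    # Random unit vector used as the spike direction.
    v = rng.standard_normal(n)
    v /= np.linalg.norm(v)

    # Rank-one perturbation of strength theta.
    m = bulk + theta * np.outer(v, v)

    # eigvalsh returns eigenvalues in ascending order.
    eigs = np.linalg.eigvalsh(m)
    return eigs[-1], eigs[-2]


top, second = spiked_wigner_top_eigs()
predicted_outlier = 2.0 + 1.0 / 2.0  # theta + 1/theta for theta = 2
print(f"largest eigenvalue: {top:.3f} (classical prediction {predicted_outlier:.3f})")
print(f"second largest:     {second:.3f} (bulk edge near 2)")
```

The largest eigenvalue separates from the bulk while the second stays near the semicircle edge. The paper's theory concerns how such outliers move during training and when the spike is correlated with the bulk, neither of which this static example captures.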
