Random Matrix Theory of Early-Stopped Gradient Flow: A Transient BBP Scenario
Florentin Coeurdoux, Grégoire Ferré, Jean-Philippe Bouchaud

TL;DR
This paper uses random matrix theory to model the finite time window in which early-stopped gradient flow can detect a signal: an isolated eigenvalue separates from the noisy bulk and later disappears as overfitting sets in, with anisotropy in the input covariance as the key ingredient.
Contribution
It introduces an analytically tractable random matrix model capturing the transient spectral phenomena in early-stopped gradient flow with anisotropic input covariance.
Findings
Transient eigenvalue separation indicates signal detection before overfitting.
The full time-dependent bulk spectrum is derived from a 2x2 Dyson equation.
Phase diagrams map the conditions under which the signal emerges and is reabsorbed.
Abstract
Empirical studies of trained models often report a transient regime in which signal is detectable in a finite gradient descent time window before overfitting dominates. We provide an analytically tractable random-matrix model that reproduces this phenomenon for gradient flow in a linear teacher-student setting. In this framework, learning occurs when an isolated eigenvalue separates from a noisy bulk, before eventually disappearing in the overfitting regime. The key ingredient is anisotropy in the input covariance, which induces fast and slow directions in the learning dynamics. In a two-block covariance model, we derive the full time-dependent bulk spectrum of the symmetrized weight matrix through a Dyson equation, and we obtain an explicit outlier condition for a rank-one teacher via a rank-two determinant formula. This yields a transient Baik-Ben Arous-Péché (BBP)…
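The setting described in the abstract can be explored numerically. The following is a minimal sketch, not the authors' model: it assumes a squared loss L(W) = ||Y - W X||_F^2 / (2n), a zero initialization W(0) = 0, Gaussian inputs with a two-block diagonal covariance, a rank-one teacher theta * u u^T, and additive Gaussian label noise. All function names, parameter names, and default values (`lam_fast`, `lam_slow`, `theta`, `sigma`, and so on) are hypothetical choices made for illustration, and the normalizations may differ from those in the paper. Under these assumptions gradient flow has the closed form W(t) = B C^{-1} (I - e^{-Ct}), with C = X X^T / n and B = Y X^T / n, which the code evaluates through the eigendecomposition of C.

```python
import numpy as np


def gradient_flow_weights(X, Y, t):
    """Closed-form solution of dW/dt = -(W C - B) with W(0) = 0.

    C = X X^T / n is the empirical input covariance and B = Y X^T / n.
    The solution is W(t) = B V diag(f(lam)) V^T with
    f(lam) = (1 - exp(-lam t)) / lam, and f(0) = t by continuity.
    """
    n = X.shape[1]
    C = X @ X.T / n
    B = Y @ X.T / n
    evals, evecs = np.linalg.eigh(C)
    positive = evals > 1e-12
    safe = np.where(positive, evals, 1.0)  # avoid division by zero
    filt = np.where(positive, -np.expm1(-safe * t) / safe, t)
    return B @ (evecs * filt) @ evecs.T


def simulate(d=200, n=400, frac_fast=0.5, lam_fast=4.0, lam_slow=0.25,
             theta=2.0, sigma=1.0, times=(0.1, 1.0, 10.0, 100.0), seed=0):
    """Return (t, largest, second-largest) eigenvalues of the symmetrized W(t).

    Hypothetical setup: two-block diagonal covariance (fast and slow
    directions), rank-one teacher, Gaussian label noise.
    """
    rng = np.random.default_rng(seed)
    d_fast = int(frac_fast * d)
    variances = np.concatenate([np.full(d_fast, lam_fast),
                                np.full(d - d_fast, lam_slow)])
    X = np.sqrt(variances)[:, None] * rng.standard_normal((d, n))
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)
    W_star = theta * np.outer(u, u)  # rank-one teacher
    Y = W_star @ X + sigma * rng.standard_normal((d, n))
    results = []
    for t in times:
        W = gradient_flow_weights(X, Y, t)
        S = 0.5 * (W + W.T)  # symmetrized weight matrix
        ev = np.linalg.eigvalsh(S)  # ascending order
        results.append((t, ev[-1], ev[-2]))
    return results


if __name__ == "__main__":
    for t, top, second in simulate():
        print(f"t={t:8.2f}  top={top:.4f}  second={second:.4f}  gap={top - second:.4f}")
```

Comparing the largest eigenvalue with the second-largest one gives a crude proxy for an outlier detaching from the bulk. Whether and when a gap opens and closes depends on the assumed parameters, so the sketch is a starting point for experimentation rather than a reproduction of the paper's phase diagrams.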
