High-dimensional learning dynamics of multi-pass Stochastic Gradient Descent in multi-index models

Zhou Fan; Leda Wang

arXiv:2601.21093 · stat.ML · February 19, 2026

TL;DR

This paper provides an exact asymptotic analysis of the learning dynamics of multi-pass mini-batch SGD in high-dimensional multi-index models, revealing the effects of batch size and learning rate scaling.

Contribution

It introduces a system of dynamical mean-field equations, driven by a scalar Poisson jump process that captures the asymptotic limit of SGD sampling noise, to characterize the coordinate-wise dynamics of SGD in high dimensions.
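As a rough intuition for why Poisson statistics can appear in mini-batch sampling, the sketch below counts how often each training sample is drawn over roughly two passes through the data. This is an illustration only, not the paper's derivation: the function name `inclusion_counts`, the sizes, and the choice $\alpha = 1/2$ are all assumptions. When the batch size is a vanishing fraction of $n$, each sample's count is binomial with a small success probability, so its mean and variance should be close, as for a Poisson variable.

```python
import numpy as np

def inclusion_counts(n, batch_size, n_steps, rng):
    """Count how many times each of n samples lands in a mini-batch."""
    counts = np.zeros(n, dtype=int)
    for _ in range(n_steps):
        # Each batch is drawn without replacement from the same fixed data set.
        idx = rng.choice(n, size=batch_size, replace=False)
        counts[idx] += 1
    return counts

rng = np.random.default_rng(0)
n = 4000
batch_size = int(n ** 0.5)        # sub-linear batch size, alpha = 1/2 (assumed)
n_steps = 2 * n // batch_size     # roughly two passes over the data
counts = inclusion_counts(n, batch_size, n_steps, rng)

# For a Poisson variable the mean and variance coincide; compare the two.
print("mean count:", counts.mean(), "variance:", counts.var())
```

Changing the exponent in `batch_size` while keeping the number of passes fixed leaves the mean count essentially unchanged, which is one informal way to see why sub-linear batch sizes might share a common limit.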

Findings

01

The asymptotic SGD dynamics are the same for every sub-linear batch size $\kappa \asymp n^{\alpha}$ with exponent $\alpha \in [0, 1)$

02

SGD, the Stochastic Modified Equation (SME), and gradient flow have distinct dynamics under certain scalings

03

The analysis recovers known results for gradient flow and online SGD in specific limits

Abstract

We study the learning dynamics of a multi-pass, mini-batch Stochastic Gradient Descent (SGD) procedure for empirical risk minimization in high-dimensional multi-index models with isotropic random data. In an asymptotic regime where the sample size $n$ and data dimension $d$ increase proportionally, for any sub-linear batch size $\kappa \asymp n^{\alpha}$ where $\alpha \in [0, 1)$, and for a commensurate "critical" scaling of the learning rate, we provide an asymptotically exact characterization of the coordinate-wise dynamics of SGD. This characterization takes the form of a system of dynamical mean-field equations, driven by a scalar Poisson jump process that represents the asymptotic limit of SGD sampling noise. We develop an analogous characterization of the Stochastic Modified Equation (SME) which provides a Gaussian diffusion approximation to SGD. Our analyses imply that the…
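The setting in the abstract can be made concrete with a small simulation. The following is a minimal sketch under stated assumptions, not the authors' setup: the `tanh` link function, the squared loss, the random initialization, and the names `make_multi_index_data` and `multipass_minibatch_sgd` are all hypothetical choices made for illustration, and the learning rate is picked by hand rather than by the paper's critical scaling.

```python
import numpy as np

def make_multi_index_data(n, d, k, rng):
    """Isotropic Gaussian features; labels depend on X only through k projections."""
    X = rng.standard_normal((n, d))
    W_star = rng.standard_normal((d, k)) / np.sqrt(d)  # hypothetical true directions
    y = np.tanh(X @ W_star).sum(axis=1)                # illustrative link (assumed)
    return X, y, W_star

def multipass_minibatch_sgd(X, y, k, batch_size, lr, n_steps, rng):
    """Mini-batch SGD on the squared-loss empirical risk, reusing the same n samples."""
    n, d = X.shape
    W = rng.standard_normal((d, k)) / np.sqrt(d)
    risks = []
    for _ in range(n_steps):
        # Multi-pass: every batch is drawn from the same fixed data set.
        idx = rng.choice(n, size=batch_size, replace=False)
        Z = X[idx] @ W                                  # (batch, k) pre-activations
        resid = np.tanh(Z).sum(axis=1) - y[idx]
        # Gradient of 0.5 * mean(resid**2) with respect to W.
        grad = X[idx].T @ (resid[:, None] * (1.0 - np.tanh(Z) ** 2)) / batch_size
        W -= lr * grad
        # Track the full empirical risk after each step.
        risks.append(0.5 * np.mean((np.tanh(X @ W).sum(axis=1) - y) ** 2))
    return W, np.array(risks)

rng = np.random.default_rng(0)
n, d, k = 200, 100, 2                 # n and d of comparable size
batch_size = int(n ** 0.5)            # kappa ~ n^alpha with alpha = 1/2 (assumed)
X, y, W_star = make_multi_index_data(n, d, k, rng)
W, risks = multipass_minibatch_sgd(X, y, k, batch_size, lr=0.05, n_steps=300, rng=rng)
print("empirical risk after first and last step:", risks[0], risks[-1])
```

Rerunning with different exponents for `batch_size`, and a learning rate adjusted to match, is one way to probe empirically how the risk trajectory depends on the batch-size scaling.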

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Markov Chains and Monte Carlo Methods · Gaussian Processes and Bayesian Inference