Rigorous dynamical mean field theory for stochastic gradient descent methods

Cedric Gerbelot; Emanuele Troiani; Francesca Mignacco; Florent Krzakala; Lenka Zdeborova

arXiv:2210.06591 · math-ph · November 24, 2025 · 6 cites

TL;DR

This paper derives exact high-dimensional asymptotic equations for first-order gradient methods such as SGD, connects them to dynamical mean-field theory from physics, and provides tools for analyzing their behavior on Gaussian data.

Contribution

It introduces a rigorous framework linking gradient descent algorithms to dynamical mean-field theory. The framework covers non-separable update functions and datasets with non-identity covariance matrices.

Findings

1. Derived closed-form equations for high-dimensional asymptotics of gradient methods

2. Connected these equations to dynamical mean-field theory from physics

3. Provided numerical implementations for SGD with various batch sizes and learning rates

Abstract

We prove closed-form equations for the exact high-dimensional asymptotics of a family of first order gradient-based methods, learning an estimator (e.g. M-estimator, shallow neural network, ...) from observations on Gaussian data with empirical risk minimization. This includes widely used algorithms such as stochastic gradient descent (SGD) or Nesterov acceleration. The obtained equations match those resulting from the discretization of dynamical mean-field theory (DMFT) equations from statistical physics when applied to gradient flow. Our proof method allows us to give an explicit description of how memory kernels build up in the effective dynamics, and to include non-separable update functions, allowing datasets with non-identity covariance matrices. Finally, we provide numerical implementations of the equations for SGD with generic extensive batch-size and with constant learning…
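To make the setting in the abstract concrete, the following is a minimal sketch of mini-batch SGD for empirical risk minimization on Gaussian data. It is not the authors' implementation and does not solve the DMFT equations; it only simulates the kind of finite-size dynamics that such equations are meant to describe. The ridge loss, the linear teacher vector `w_star`, the function name `sgd_gaussian_ridge`, and all parameter values are assumptions chosen for illustration. The batch is "extensive" in the sense that each sample enters the batch independently with probability `batch_fraction`, so the batch size scales with the number of samples.

```python
import numpy as np


def sgd_gaussian_ridge(n=400, d=200, batch_fraction=0.5, lr=0.05,
                       lam=0.1, steps=200, noise=0.1, seed=0):
    """Simulate mini-batch SGD on a ridge-regression risk with Gaussian data.

    All names and defaults are illustrative assumptions, not taken from the paper.
    Returns the final weights, the teacher vector, and the full-sample risk
    recorded after every step.
    """
    rng = np.random.default_rng(seed)

    # Gaussian design with entries of variance 1/d (identity covariance up to scale).
    X = rng.standard_normal((n, d)) / np.sqrt(d)
    # Hypothetical linear teacher generating the labels.
    w_star = rng.standard_normal(d)
    y = X @ w_star + noise * rng.standard_normal(n)

    w = np.zeros(d)
    losses = np.empty(steps)
    for t in range(steps):
        # Each sample joins the batch independently with probability batch_fraction.
        mask = rng.random(n) < batch_fraction
        Xb, yb = X[mask], y[mask]
        # Rescale by 1/batch_fraction so the batch gradient is an unbiased
        # estimate of the full-sample gradient of the data-fit term.
        grad = Xb.T @ (Xb @ w - yb) / batch_fraction + lam * w
        w = w - lr * grad  # constant learning rate
        # Full-sample regularized empirical risk after the update.
        losses[t] = 0.5 * np.sum((X @ w - y) ** 2) + 0.5 * lam * np.sum(w ** 2)
    return w, w_star, losses


if __name__ == "__main__":
    w, w_star, losses = sgd_gaussian_ridge()
    # Low-dimensional summary statistics of the kind tracked in mean-field analyses.
    print("overlap with teacher:", float(w @ w_star) / w.size)
    print("squared norm per dimension:", float(w @ w) / w.size)
    print("final empirical risk:", losses[-1])
```

Setting `batch_fraction=1.0` recovers full-batch gradient descent, which is a convenient sanity check against the stochastic runs.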

Code & Models

Repositories

spoc-group/rigorous-dynamical-mean-field-theory (Official)

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Model Reduction and Neural Networks · Markov Chains and Monte Carlo Methods

Methods: Stochastic Gradient Descent