Limit Theorems for Stochastic Gradient Descent in High-Dimensional Single-Layer Networks
Parsa Rangriz

TL;DR
This paper analyzes the high-dimensional behavior of online stochastic gradient descent in single-layer networks, revealing phase transitions and the role of stochastic fluctuations in learning dynamics.
Contribution
The paper characterizes the critical scaling regime of the SGD step size and shows how stochastic effects change the phase diagram and the sample complexity in high dimensions.
Findings
Below the critical step-size regime, the effective dynamics follow deterministic (ballistic) limits.
At the critical scale, a new correction term emerges and changes the phase diagram.
Near fixed points, and under certain conditions, the diffusive (SDE) limits reduce to an Ornstein-Uhlenbeck process.
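For readers unfamiliar with the Ornstein-Uhlenbeck process named in the findings, the following is a minimal one-dimensional sketch of the SDE dX = -theta * X dt + sigma dW, simulated with the Euler-Maruyama scheme. The function name `simulate_ou`, the parameter names, and the numerical values are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def simulate_ou(theta, sigma, x0, dt, n_steps, seed=0):
    """Euler-Maruyama path of dX = -theta * X dt + sigma dW (illustrative only)."""
    rng = np.random.default_rng(seed)
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        # Mean-reverting drift toward the fixed point at 0.
        drift = -theta * x[k] * dt
        # Gaussian increment with variance sigma^2 * dt.
        noise = sigma * np.sqrt(dt) * rng.standard_normal()
        x[k + 1] = x[k] + drift + noise
    return x


if __name__ == "__main__":
    # Hypothetical parameter values chosen only for demonstration.
    path = simulate_ou(theta=1.0, sigma=0.5, x0=2.0, dt=0.01, n_steps=20000)
    # For the continuous-time process the stationary variance is sigma^2 / (2 * theta).
    print("empirical variance (second half):", path[10000:].var())
    print("stationary variance (theory):    ", 0.5**2 / (2 * 1.0))
```

Setting `sigma=0` recovers a purely deterministic decay toward the fixed point, which is the part of the picture that a ballistic limit alone would capture.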
Abstract
This paper studies the high-dimensional scaling limits of online stochastic gradient descent (SGD). Building on the recent work of Ben Arous, Gheissari, and Jagannath on the effective dynamics of SGD, we study the critical scaling regime of the step size for single-layer networks. Below this critical regime, the effective dynamics are governed by deterministic (ballistic) limits, whereas at the critical scale, a new correction term emerges that changes the phase diagram. In this regime, near fixed points, the corresponding diffusive (SDE) limits of the effective dynamics reduce to an Ornstein-Uhlenbeck process under certain conditions. These results highlight how the information exponent controls sample complexity and illustrate the limitations of deterministic scaling limits in capturing stochastic fluctuations in high-dimensional learning dynamics.
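The setting in the abstract can be made concrete with a small simulation of online SGD, where each step uses one fresh sample and a low-dimensional summary statistic is tracked over time. The sketch below is not the paper's model: it assumes Gaussian inputs, a tanh teacher and student, squared loss, and a step size of the form `c / d`, all chosen only for illustration. The function name `online_sgd_overlap` and the constant `c` are hypothetical.

```python
import numpy as np


def online_sgd_overlap(d, n_steps, c, seed=0):
    """Run online SGD on a toy single-layer model and record the overlap <x, v>.

    Assumptions (not from the paper): inputs a ~ N(0, I_d), labels
    y = tanh(<a, v>), squared loss, step size c / d.
    """
    rng = np.random.default_rng(seed)
    v = np.zeros(d)
    v[0] = 1.0                                  # unit-norm teacher direction
    x = rng.standard_normal(d) / np.sqrt(d)     # random student initialization
    step = c / d
    overlaps = np.empty(n_steps + 1)
    overlaps[0] = x @ v
    for k in range(n_steps):
        a = rng.standard_normal(d)              # one fresh sample per step
        y = np.tanh(a @ v)
        pred = np.tanh(a @ x)
        # Gradient of 0.5 * (pred - y)^2 with respect to x.
        grad = (pred - y) * (1.0 - pred**2) * a
        x = x - step * grad
        overlaps[k + 1] = x @ v
    return overlaps


if __name__ == "__main__":
    # Hypothetical sizes; vary c to see how the step-size constant affects the path.
    m = online_sgd_overlap(d=200, n_steps=4000, c=0.5)
    print("initial overlap:", m[0])
    print("final overlap:  ", m[-1])
```

Repeating the run over several seeds and values of `c` gives a rough feel for how much of the overlap trajectory is reproducible across runs and how much is fluctuation, which is the distinction between ballistic and diffusive limits that the abstract draws.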
