Exact Risk Curves of signSGD in High-Dimensions: Quantifying Preconditioning and Noise-Compression Effects
Ke Liang Xiao, Noah Marshall, Atish Agarwala, Elliot Paquette

TL;DR
This paper analyzes signSGD in a high-dimensional limit, deriving exact risk curves and quantifying how it preconditions optimization and compresses noise.
Contribution
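It derives a limiting SDE and ODE that describe the risk of signSGD in high dimensions, uses them to quantify four effects (effective learning rate, noise compression, diagonal preconditioning, and gradient noise reshaping), and relates these effects to the data and noise distributions.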
Findings
Quantifies effective learning rate and noise compression effects.
Derives a limiting SDE and ODE for signSGD in high dimensions.
Characterizes how signSGD reshapes gradient noise and acts as a diagonal preconditioner, depending on the data and noise distributions.
Abstract
In recent years, signSGD has garnered interest as both a practical optimizer as well as a simple model to understand adaptive optimizers like Adam. Though there is a general consensus that signSGD acts to precondition optimization and reshapes noise, quantitatively understanding these effects in theoretically solvable settings remains difficult. We present an analysis of signSGD in a high dimensional limit, and derive a limiting SDE and ODE to describe the risk. Using this framework we quantify four effects of signSGD: effective learning rate, noise compression, diagonal preconditioning, and gradient noise reshaping. Our analysis is consistent with experimental observations but moves beyond that by quantifying the dependence of these effects on the data and noise distributions. We conclude with a conjecture on how these results might be extended to Adam.
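To make the object of study concrete, here is a minimal sketch of one-sample signSGD on a noisy linear regression problem, recording the population risk at each step. It is an illustration under stated assumptions, not the paper's experimental setup: isotropic Gaussian data, Gaussian label noise, and a step size of `lr / d` are all assumed here, and the function name `sign_sgd_risk_curve` and its parameters are hypothetical.

```python
import numpy as np


def sign_sgd_risk_curve(d=200, steps=2000, lr=0.3, noise_std=0.1, seed=0):
    """Run one-sample signSGD on least squares and return the risk per step.

    Assumptions (illustrative, not taken from the paper):
    - data x ~ N(0, I_d), labels y = <x, theta_star> + Gaussian noise
    - step size lr / d, so that d steps correspond to one unit of "time"
    """
    rng = np.random.default_rng(seed)
    theta_star = rng.standard_normal(d) / np.sqrt(d)  # target, norm roughly 1
    theta = np.zeros(d)
    risks = np.empty(steps)

    for k in range(steps):
        # Draw a single fresh sample (streaming setting).
        x = rng.standard_normal(d)
        y = x @ theta_star + noise_std * rng.standard_normal()

        # Stochastic gradient of 0.5 * (<x, theta> - y)^2.
        grad = (x @ theta - y) * x

        # signSGD: keep only the sign of each gradient coordinate.
        theta -= (lr / d) * np.sign(grad)

        # Population risk for identity covariance:
        # 0.5 * E[(<x, theta> - y)^2] = 0.5 * (||theta - theta_star||^2 + noise_std^2)
        risks[k] = 0.5 * (np.sum((theta - theta_star) ** 2) + noise_std ** 2)

    return risks


if __name__ == "__main__":
    risks = sign_sgd_risk_curve()
    print(f"risk after first step: {risks[0]:.4f}")
    print(f"risk after last step:  {risks[-1]:.4f}")
```

Because every coordinate moves by the same magnitude `lr / d` regardless of the gradient's size, the risk in this sketch levels off at a floor set by the step size rather than decaying to the label-noise level. Repeating the run for several values of `d` and plotting `risks` against `k / d` gives a rough empirical curve of the kind a high-dimensional limit would describe.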
Taxonomy
Topics: 3D Shape Modeling and Analysis · Infrastructure Maintenance and Monitoring · Rock Mechanics and Modeling
Methods: Adam
