Stochastic Modified Equations and Dynamics of Stochastic Gradient Algorithms I: Mathematical Foundations
Qianxiao Li, Cheng Tai, Weinan E

TL;DR
This paper develops the mathematical foundations of stochastic modified equations (SME): stochastic differential equations with small noise that approximate stochastic gradient algorithms in the weak sense, which makes the dynamics of these algorithms amenable to continuous-time analysis.
Contribution
It introduces the stochastic modified equations framework for stochastic gradient algorithms, offering a rigorous weak approximation and analytical insights into their dynamics.
Findings
Provides a mathematical foundation for SME in stochastic gradient analysis
Demonstrates the approximation of SGD and momentum methods by stochastic differential equations
Uncovers analytical insights into stochastic algorithms through continuous-time modeling
Abstract
We develop the mathematical foundations of the stochastic modified equations (SME) framework for analyzing the dynamics of stochastic gradient algorithms, where the latter are approximated by a class of stochastic differential equations with small noise parameters. We prove that this approximation can be understood mathematically as a weak approximation, which leads to a number of precise and useful results on the approximations of stochastic gradient descent (SGD), momentum SGD and stochastic Nesterov's accelerated gradient method in the general setting of stochastic objectives. We also demonstrate through explicit calculations that this continuous-time approach can uncover important analytical insights into the stochastic gradient algorithms under consideration that may not be easy to obtain in a purely discrete-time setting.
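To make the notion of weak approximation concrete, here is a minimal sketch on a toy problem that is not taken from the paper. It assumes the one-dimensional stochastic objective f_γ(x) = (x - γ)²/2 with γ ~ N(0, 1), so that SGD reads x_{k+1} = x_k - η(x_k - γ_k), and it assumes the first-order SME takes the form dX = -∇f(X) dt + sqrt(ηΣ) dW, which here reduces to the Ornstein-Uhlenbeck equation dX = -X dt + sqrt(η) dW because Σ = 1. For the test function g(x) = x², both E[x_k²] and E[X_t²] are available in closed form, so the weak error can be computed without Monte Carlo noise. The function names are hypothetical and chosen for illustration only.

```python
import math


def sgd_second_moment(eta, x0, n_steps):
    """Exact E[x_k^2], k = 0..n_steps, for x_{k+1} = (1 - eta) x_k + eta * gamma_k.

    Assumes gamma_k ~ N(0, 1), independent of x_k (toy setting, not from the paper).
    """
    moments = [x0 * x0]
    for _ in range(n_steps):
        # E[x_{k+1}^2] = (1 - eta)^2 E[x_k^2] + eta^2 E[gamma_k^2]
        moments.append((1.0 - eta) ** 2 * moments[-1] + eta ** 2)
    return moments


def sme_second_moment(eta, x0, t):
    """Exact E[X_t^2] for the assumed SME dX = -X dt + sqrt(eta) dW, X_0 = x0."""
    decay = math.exp(-2.0 * t)
    return x0 * x0 * decay + 0.5 * eta * (1.0 - decay)


def max_weak_error(eta, x0=1.0, horizon=1.0):
    """Largest |E[x_k^2] - E[X_{k*eta}^2]| over the steps k = 0..round(horizon/eta)."""
    n_steps = round(horizon / eta)
    sgd = sgd_second_moment(eta, x0, n_steps)
    return max(
        abs(sgd[k] - sme_second_moment(eta, x0, k * eta))
        for k in range(n_steps + 1)
    )


if __name__ == "__main__":
    # Halving the learning rate should roughly halve the error (first-order behaviour).
    for eta in (0.1, 0.05, 0.025):
        print(f"eta={eta:<6} max weak error={max_weak_error(eta):.5f}")
```

In this toy setting the error shrinks roughly in proportion to η, which is what an order-1 weak approximation would predict. It is an illustration of the definition under the stated assumptions, not a reproduction of the paper's results.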
Taxonomy
Topics: Stochastic processes and financial applications · Stochastic Gradient Optimization Techniques · Mathematical Biology Tumor Growth
Methods: Stochastic Gradient Descent
