Error estimates between SGD with momentum and underdamped Langevin diffusion
Arnaud Guillin (LMBP), Yu Wang, Lihu Xu, Haoran Yang

TL;DR
This paper quantifies the error between stochastic gradient descent (SGD) with momentum and the underdamped Langevin diffusion, bounding it in the 1-Wasserstein and total variation distances.
Contribution
The paper establishes a quantitative error estimate between SGD with momentum and the underdamped Langevin diffusion in the 1-Wasserstein and total variation distances.
Findings
Error bounds in 1-Wasserstein distance
Error bounds in total variation distance
Quantitative comparison of SGD with momentum and Langevin diffusion
Abstract
Stochastic gradient descent with momentum is a popular variant of stochastic gradient descent, which has recently been reported to have a close relationship with the underdamped Langevin diffusion. In this paper, we establish a quantitative error estimate between them in the 1-Wasserstein and total variation distances.
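To make the two objects being compared concrete, here is a minimal sketch that runs heavy-ball SGD next to an Euler-Maruyama discretization of an underdamped Langevin diffusion. Everything in it is an illustrative assumption rather than the paper's setup: the quadratic potential U(x) = x^2 / 2, the Gaussian gradient noise, the textbook form dX = V dt, dV = -(gamma V + U'(X)) dt + sqrt(2 gamma) dB, and all parameter names and values. The scaling that links the step size and momentum to the diffusion in the paper is not stated in this summary, so none is assumed here.

```python
import math
import random


def grad_U(x):
    """Gradient of the illustrative potential U(x) = x**2 / 2 (an assumption)."""
    return x


def sgd_momentum(x0, eta, mu, sigma, n_steps, rng):
    """Heavy-ball SGD: v <- mu*v - eta*g, x <- x + v, with a noisy gradient g."""
    x, v = x0, 0.0
    for _ in range(n_steps):
        # Stochastic gradient: exact gradient plus Gaussian noise of scale sigma.
        g = grad_U(x) + sigma * rng.gauss(0.0, 1.0)
        v = mu * v - eta * g
        x = x + v
    return x


def underdamped_langevin(x0, h, gamma, n_steps, rng):
    """Euler-Maruyama for dX = V dt, dV = -(gamma*V + U'(X)) dt + sqrt(2*gamma) dB."""
    x, v = x0, 0.0
    for _ in range(n_steps):
        # Brownian increment over a step of length h, scaled by sqrt(2*gamma).
        noise = math.sqrt(2.0 * gamma * h) * rng.gauss(0.0, 1.0)
        x_new = x + h * v
        v = v - h * (gamma * v + grad_U(x)) + noise
        x = x_new
    return x


rng = random.Random(0)  # fixed seed so the run is reproducible
x_sgd = sgd_momentum(x0=1.0, eta=0.1, mu=0.9, sigma=0.0, n_steps=500, rng=rng)
x_uld = underdamped_langevin(x0=1.0, h=0.01, gamma=1.0, n_steps=1000, rng=rng)
print(x_sgd, x_uld)
```

With sigma set to 0 the SGD iterate contracts toward the minimizer at 0, while the Langevin position keeps fluctuating because of the Brownian term. The paper's error estimates concern the distance between the laws of the two processes, which this single-trajectory sketch does not measure.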
Taxonomy
Topics: Advancements in Semiconductor Devices and Circuit Design · Atomic and Subatomic Physics Research · stochastic dynamics and bifurcation
