Revisiting the central limit theorems for the SGD-type methods
Tiejun Li, Tiannan Xiao, Guoguo Yang

TL;DR
This paper establishes the central limit theorem (CLT) for vanilla, momentum, and Nesterov accelerated SGD under more general learning-rate conditions, examines the time-average CLT in both linear and nonlinear cases, and verifies the findings through numerical experiments.
Contribution
It provides a more general CLT for SGD-type methods using Lyapunov functions and $L^p$ bounds, covering broader learning rate conditions.
Findings
The CLT for the time average holds in the linear case
The CLT for the time average does not generally hold in the nonlinear case
Numerical tests confirm the theoretical results
Abstract
We revisited the central limit theorem (CLT) for stochastic gradient descent (SGD) type methods, including the vanilla SGD, momentum SGD and Nesterov accelerated SGD methods with constant or vanishing damping parameters. By taking advantage of the Lyapunov function technique and $L^p$ bound estimates, we established the CLT under more general conditions on learning rates for broader classes of SGD methods compared with previous results. The CLT for the time average was also investigated, and we found that it held in the linear case, while it was not generally true in the nonlinear situation. Numerical tests were also carried out to verify our theoretical analysis.
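As a rough illustration of the kind of statement a CLT for SGD makes, the sketch below runs vanilla SGD on a toy one-dimensional linear problem and checks the scale of the fluctuations. Everything in it is an assumption made for illustration and is not taken from the paper: the objective $f(x) = x^2/2$, the additive Gaussian gradient noise, the constant learning rate `eta`, and the function names. For this linear recursion the iterate rescaled by $\sqrt{\eta}$ has exact stationary variance $\sigma^2/(2-\eta)$, which tends to $\sigma^2/2$ as $\eta \to 0$.

```python
import math
import random
import statistics


def sgd_final_iterate(eta, n_steps, sigma, rng, x0=1.0):
    """Run vanilla SGD on the toy objective f(x) = x**2 / 2 (an assumption,
    not the paper's test problem) with additive Gaussian gradient noise."""
    x = x0
    for _ in range(n_steps):
        noisy_grad = x + sigma * rng.gauss(0.0, 1.0)  # exact gradient is x
        x -= eta * noisy_grad
    return x


def rescaled_variance(eta, n_steps=500, n_runs=2000, sigma=1.0, seed=0):
    """Sample variance of x_K / sqrt(eta) over independent SGD runs."""
    rng = random.Random(seed)
    samples = [
        sgd_final_iterate(eta, n_steps, sigma, rng) / math.sqrt(eta)
        for _ in range(n_runs)
    ]
    return statistics.variance(samples)


def stationary_variance(eta, sigma=1.0):
    """Exact stationary variance of x / sqrt(eta) for the recursion
    x_{k+1} = (1 - eta) * x_k - eta * sigma * xi_k."""
    return sigma**2 / (2.0 - eta)


if __name__ == "__main__":
    for eta in (0.1, 0.05, 0.02):
        print(eta, rescaled_variance(eta), stationary_variance(eta))
```

Running the script prints, for each learning rate, the empirical variance next to the closed-form value; the two should agree up to Monte Carlo error. The nonlinear and time-average cases discussed in the abstract are not covered by this sketch.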
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Stochastic processes and financial applications · Statistical Methods and Inference
