Revisiting the central limit theorems for the SGD-type methods

Tiejun Li; Tiannan Xiao; Guoguo Yang

arXiv:2207.11755 · math.OC · June 12, 2023 · 1 citation

Open Access

TL;DR

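This paper establishes central limit theorems for SGD-type methods (vanilla SGD, momentum SGD, and Nesterov accelerated SGD) under more general conditions on the learning rates, examines the CLT for the time average in both linear and nonlinear settings, and checks the analysis with numerical tests.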

Contribution

Using the Lyapunov function technique and $L^p$ bound estimates, the paper establishes the CLT for broader classes of SGD-type methods, under more general conditions on the learning rates than previous results.

Findings

01 The CLT for the time average holds in the linear case.

02 The CLT for the time average is not generally true in the nonlinear case.

03 Numerical tests support the theoretical analysis.

Abstract

We revisited the central limit theorem (CLT) for stochastic gradient descent (SGD) type methods, including the vanilla SGD, momentum SGD and Nesterov accelerated SGD methods with constant or vanishing damping parameters. By taking advantage of the Lyapunov function technique and $L^{p}$ bound estimates, we established the CLT under more general conditions on learning rates for broader classes of SGD methods compared with previous results. The CLT for the time average was also investigated, and we found that it held in the linear case, while it was not generally true in the nonlinear situation. Numerical tests were also carried out to verify our theoretical analysis.
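To make the kind of statement in the abstract concrete, here is a minimal sketch of a CLT check for vanilla SGD on a scalar linear problem. It is an illustration under stated assumptions, not the paper's experiment: the objective $f(x) = a x^2/2$, the additive Gaussian gradient noise with standard deviation `sigma`, the learning rate schedule `eta0 / n**alpha`, and the function name `scaled_sgd_errors` are all chosen here for illustration. For this toy model with `alpha < 1`, the classical stochastic approximation result says the error divided by the square root of the learning rate is approximately Gaussian with variance `sigma**2 / (2 * a)`.

```python
import numpy as np


def scaled_sgd_errors(a=1.0, sigma=1.0, eta0=0.5, alpha=0.5,
                      n_steps=2000, n_runs=10000, seed=0):
    """Run n_runs independent vanilla SGD chains on f(x) = a * x**2 / 2
    and return the final iterates divided by sqrt(last learning rate)."""
    rng = np.random.default_rng(seed)
    x = np.ones(n_runs)  # every chain starts at x = 1; the minimizer is 0
    eta = eta0
    for n in range(1, n_steps + 1):
        eta = eta0 / n**alpha  # vanishing learning rate (assumed schedule)
        # Exact gradient a * x plus additive Gaussian noise (assumed model).
        noisy_grad = a * x + sigma * rng.standard_normal(n_runs)
        x = x - eta * noisy_grad
    # Rescale the error x - 0 by the square root of the final learning rate.
    return x / np.sqrt(eta)


z = scaled_sgd_errors()
empirical_var = float(z.var())
predicted_var = 1.0**2 / (2 * 1.0)  # sigma**2 / (2 * a) for the defaults
print(f"empirical variance: {empirical_var:.3f}, predicted: {predicted_var:.3f}")
```

At a finite number of steps the empirical variance sits slightly above the limiting value, because the correction terms from the decaying schedule have not fully vanished. A histogram of `z` can be compared against the Gaussian density with variance `predicted_var` for a visual check.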

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Stochastic processes and financial applications · Statistical Methods and Inference