A convergence study of SGD-type methods for stochastic optimization

Tiannan Xiao; Guoguo Yang

arXiv:2211.06197·math.OC·June 12, 2023

Open Access

TL;DR

This paper provides a comprehensive convergence analysis of various SGD-type methods, including vanilla, momentum, and Nesterov accelerated SGD, under broader conditions for convex and non-convex stochastic optimization problems.

Contribution

It extends convergence results for vanilla SGD to more general learning-rate conditions and a more general convexity assumption, and uses Lyapunov functions to analyze momentum and Nesterov accelerated SGD for convex and non-convex problems.

Findings

1. Convergence of vanilla SGD under relaxed learning rate conditions.

2. Convergence of momentum and Nesterov accelerated SGD for convex and non-convex problems.

3. Analysis of time-averaged SGD convergence.

Abstract

In this paper, we first reinvestigate the convergence of the vanilla SGD method in the sense of $L^{2}$ under more general learning-rate conditions and a more general convexity assumption, which relaxes the conditions on the learning rates and does not require the problem to be strongly convex. Then, by taking advantage of the Lyapunov function technique, we present the convergence of the momentum SGD and Nesterov accelerated SGD methods for convex and non-convex problems under the $L$-smooth assumption, which relaxes the bounded-gradient restriction to a certain extent. The convergence of time-averaged SGD is also analyzed.
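
For orientation, the three update rules named in the abstract, together with a running time average of the iterates, can be sketched as follows. This is a minimal illustration and not the paper's setup: the test objective $f(x) = x^2/2$, the Gaussian gradient noise, the step-size schedule `0.5 / k**0.75`, the momentum value `beta = 0.9`, and the helper names `noisy_grad` and `run_sgd` are all assumptions chosen for the example. The momentum and Nesterov updates are written in one common textbook form, which may differ from the parameterization analyzed in the paper.

```python
import random


def noisy_grad(x, rng, noise=0.1):
    # Stochastic gradient of the assumed objective f(x) = 0.5 * x**2:
    # the exact gradient x plus zero-mean Gaussian noise.
    return x + rng.gauss(0.0, noise)


def run_sgd(method="vanilla", steps=5000, beta=0.9, seed=0):
    """Run one SGD variant; return (last iterate, time-averaged iterate)."""
    rng = random.Random(seed)
    x, v, x_avg = 1.0, 0.0, 0.0
    for k in range(1, steps + 1):
        # Assumed polynomially decaying learning rate (illustrative only).
        lr = 0.5 / k ** 0.75
        if method == "vanilla":
            x = x - lr * noisy_grad(x, rng)
        elif method == "momentum":
            # Heavy-ball form: the velocity accumulates past gradients.
            v = beta * v - lr * noisy_grad(x, rng)
            x = x + v
        elif method == "nesterov":
            # Gradient is evaluated at the look-ahead point x + beta * v.
            v = beta * v - lr * noisy_grad(x + beta * v, rng)
            x = x + v
        else:
            raise ValueError(f"unknown method: {method}")
        # Running mean of the iterates x_1, ..., x_k.
        x_avg += (x - x_avg) / k
    return x, x_avg


if __name__ == "__main__":
    for name in ("vanilla", "momentum", "nesterov"):
        last, avg = run_sgd(name)
        print(f"{name:9s} last={last:+.4f} avg={avg:+.4f}")
```

On this toy problem the minimizer is $x^* = 0$, so both returned values should end up close to zero for all three variants; the residual fluctuation of the last iterate comes from the gradient noise.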

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Statistical Methods and Inference