Convergence Analysis of Accelerated Stochastic Gradient Descent under the Growth Condition
You-Lin Chen, Sen Na, Mladen Kolar

TL;DR
This paper analyzes how four accelerated stochastic gradient methods (NAM, RMM, DAM+, and iDAM+) converge under the growth condition, shows how much multiplicative noise each can tolerate while remaining accelerated, and proposes a tail-averaged scheme to improve their convergence rates.
Contribution
It provides a comprehensive convergence analysis of multiple accelerated methods under the growth condition and introduces a tail-averaged scheme to improve their convergence rates.
Findings
All methods converge to a neighborhood of the optimum with accelerated rates.
NAM, RMM, and iDAM+ retain acceleration only when the multiplicative noise is mild.
DAM+ maintains acceleration even with large multiplicative noise.
Abstract
We study the convergence of accelerated stochastic gradient descent for strongly convex objectives under the growth condition, which states that the variance of stochastic gradient is bounded by a multiplicative part that grows with the full gradient, and a constant additive part. Through the lens of the growth condition, we investigate four widely used accelerated methods: Nesterov's accelerated method (NAM), robust momentum method (RMM), accelerated dual averaging method (DAM+), and implicit DAM+ (iDAM+). While these methods are known to improve the convergence rate of SGD under the condition that the stochastic gradient has bounded variance, it is not well understood how their convergence rates are affected by the multiplicative noise. In this paper, we show that these methods all converge to a neighborhood of the optimum with accelerated convergence rates (compared to SGD) even…
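The growth condition described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's setup: it assumes the condition takes the common form E‖g(x) − ∇f(x)‖² ≤ ρ‖∇f(x)‖² + σ², where the symbols ρ (multiplicative part) and σ (additive part) are chosen here for illustration. The two-dimensional quadratic objective, the Gaussian noise model, all function names, and the textbook Nesterov step size and momentum are likewise hypothetical choices, not the parameterizations analyzed in the paper.

```python
import math
import random


def full_gradient(x, curvatures):
    """Gradient of the strongly convex quadratic f(x) = 0.5 * sum(a_i * x_i**2)."""
    return [a * xi for a, xi in zip(curvatures, x)]


def noisy_gradient(x, curvatures, rho, sigma, rng):
    """Full gradient plus zero-mean Gaussian noise whose total variance is
    rho * ||grad f(x)||^2 + sigma^2 (the assumed growth condition, with equality)."""
    grad = full_gradient(x, curvatures)
    grad_sq = sum(g * g for g in grad)
    std = math.sqrt((rho * grad_sq + sigma ** 2) / len(x))
    return [g + rng.gauss(0.0, std) for g in grad]


def noise_variance_estimate(x, curvatures, rho, sigma, rng, samples=20000):
    """Monte Carlo estimate of E||g(x) - grad f(x)||^2 at a fixed point x."""
    grad = full_gradient(x, curvatures)
    total = 0.0
    for _ in range(samples):
        g = noisy_gradient(x, curvatures, rho, sigma, rng)
        total += sum((gi - fi) ** 2 for gi, fi in zip(g, grad))
    return total / samples


def nesterov_tail_error(x0, curvatures, rho, sigma, rng, steps=500, tail=100):
    """Textbook Nesterov momentum (assumed here, not the paper's NAM constants).
    Returns the mean of ||x_k||^2 over the last `tail` iterates; the optimum is 0."""
    L, mu = max(curvatures), min(curvatures)
    alpha = 1.0 / L
    beta = (math.sqrt(L / mu) - 1.0) / (math.sqrt(L / mu) + 1.0)
    x_prev, x = list(x0), list(x0)
    tail_sum = 0.0
    for k in range(steps):
        # Extrapolate, then take a noisy gradient step from the extrapolated point.
        y = [xi + beta * (xi - pi) for xi, pi in zip(x, x_prev)]
        g = noisy_gradient(y, curvatures, rho, sigma, rng)
        x_prev, x = x, [yi - alpha * gi for yi, gi in zip(y, g)]
        if k >= steps - tail:
            tail_sum += sum(xi * xi for xi in x)
    return tail_sum / tail


curvatures = [1.0, 10.0]  # mu = 1, L = 10
x0 = [2.0, -1.0]
rho, sigma = 0.1, 0.1     # mild multiplicative noise, small additive noise

bound = rho * sum(g * g for g in full_gradient(x0, curvatures)) + sigma ** 2
estimate = noise_variance_estimate(x0, curvatures, rho, sigma, random.Random(0))
print(f"variance bound {bound:.3f}, Monte Carlo estimate {estimate:.3f}")

with_additive = nesterov_tail_error(x0, curvatures, rho, sigma, random.Random(1))
without_additive = nesterov_tail_error(x0, curvatures, rho, 0.0, random.Random(1))
print(f"tail error with sigma={sigma}: {with_additive:.2e}")
print(f"tail error with sigma=0:   {without_additive:.2e}")
```

In this toy setting the noise variance at a point scales with the squared gradient norm plus a constant, and the additive part σ is what keeps the iterates in a neighborhood of the optimum rather than at the optimum itself. Setting σ to zero while keeping a mild ρ lets the error shrink far below that floor.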
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Markov Chains and Monte Carlo Methods
