Fast Convergence of Stochastic Gradient Descent under a Strong Growth Condition
Mark Schmidt (INRIA Paris - Rocquencourt, LIENS), Nicolas Le Roux (INRIA Paris - Rocquencourt, LIENS)

TL;DR
This paper proves that stochastic gradient descent with a sufficiently small constant step size converges quickly under a strong growth condition: at an $O(1/k)$ rate for smooth convex functions and at a linear rate for strongly convex functions.
Contribution
It shows that when the norm of each component gradient is bounded by a linear function of the norm of the average gradient, stochastic gradient descent attains fast convergence rates, including linear convergence for strongly convex functions.
Findings
SGD has an $O(1/k)$ convergence rate under the growth condition.
Linear convergence is achieved if the function is strongly convex.
Both rates hold for the basic stochastic gradient method with a sufficiently small constant step size.
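One way to write the growth condition in symbols, assuming the notation $f(x) = \frac{1}{N}\sum_{i=1}^{N} f_i(x)$ and a constant $B$ (both are labels introduced here for illustration, not taken from the summary above):

$$\|\nabla f_i(x)\| \le B\,\|\nabla f(x)\| \quad \text{for all } i \text{ and all } x.$$

A direct consequence is that at any point where $\nabla f(x) = 0$, every $\nabla f_i(x)$ is zero as well, so the stochastic gradients carry no noise at a minimizer. This is one intuition for why a constant step size can suffice in this setting.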
Abstract
We consider optimizing a smooth convex function $f$ that is the average of a set of differentiable functions $f_i$, under the assumption considered by Solodov [1998] and Tseng [1998] that the norm of each gradient $\nabla f_i$ is bounded by a linear function of the norm of the average gradient $\nabla f$. We show that under these assumptions the basic stochastic gradient method with a sufficiently small constant step size has an $O(1/k)$ convergence rate, and has a linear convergence rate if $f$ is strongly convex.
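The setting in the abstract can be illustrated with a minimal sketch in pure Python. The example problem is an assumption chosen for illustration, not taken from the paper: a consistent least-squares problem, where every $f_i(x) = \frac{1}{2}(a_i^\top x - b_i)^2$ is minimized at the same point, so each component gradient vanishes wherever the average gradient does. The function names (`make_consistent_least_squares`, `sgd_constant_step`, and so on) are hypothetical, and the step size $1/\max_i \|a_i\|^2$ is one convenient choice for this instance rather than the constant used in the paper's analysis.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


def make_consistent_least_squares(n=20, d=3, seed=1):
    """Build rows a_i and targets b_i = a_i . x_star, so one x fits every term.

    Illustrative assumption: because all f_i share the minimizer x_star,
    each component gradient is zero wherever the average gradient is zero.
    """
    rng = random.Random(seed)
    x_star = [rng.uniform(-1.0, 1.0) for _ in range(d)]
    A = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    b = [dot(row, x_star) for row in A]
    return A, b, x_star


def objective(A, b, x):
    """f(x) = (1/N) * sum_i 0.5 * (a_i . x - b_i)^2."""
    return sum(0.5 * (dot(row, x) - bi) ** 2 for row, bi in zip(A, b)) / len(A)


def sgd_constant_step(A, b, step, iters, seed=0):
    """Basic SGD: sample one term uniformly, step along its negative gradient."""
    rng = random.Random(seed)
    x = [0.0] * len(A[0])
    for _ in range(iters):
        i = rng.randrange(len(A))
        residual = dot(A[i], x) - b[i]  # grad f_i(x) = residual * a_i
        x = [xj - step * residual * aij for xj, aij in zip(x, A[i])]
    return x


def distance(u, v):
    """Euclidean distance between two points."""
    return math.sqrt(sum((ui - vi) ** 2 for ui, vi in zip(u, v)))


if __name__ == "__main__":
    A, b, x_star = make_consistent_least_squares()
    # Constant step size; an assumption for this sketch, not the paper's constant.
    step = 1.0 / max(dot(row, row) for row in A)
    for iters in (0, 100, 1000, 5000):
        x = sgd_constant_step(A, b, step, iters)
        print(iters, objective(A, b, x))
```

Running the script prints the objective after increasing numbers of iterations with the step size held fixed. On this instance the objective should keep shrinking rather than stalling at a noise floor, which is the qualitative behavior the abstract describes for the strongly convex case.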
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Complexity and Algorithms in Graphs
