Last-Iterate Complexity of SGD for Convex and Smooth Stochastic Problems
Guillaume Garrigos, Daniel Cortild, Lucas Ketels, Juan Peypouquet

TL;DR
This paper proves that stochastic gradient descent (SGD) achieves an optimal last-iterate convergence rate of approximately T^{-1/2} for convex smooth stochastic problems without requiring the assumption of bounded gradient variance.
Contribution
It establishes the first last-iterate convergence rate for SGD in convex smooth stochastic problems without assuming bounded gradient variance.
Findings
SGD attains an O(T^{-1/2}) last-iterate convergence rate.
No need for the unverifiable bounded-variance assumption.
Results improve understanding of SGD's behavior in convex optimization.
Abstract
Most results on Stochastic Gradient Descent (SGD) in the convex and smooth setting are presented as bounds on the ergodic function value gap, that is, the gap evaluated at an average of the iterates. It is an open question whether bounds can be derived directly on the last iterate of SGD in this context. Recent advances suggest that it should be possible. For instance, it can be achieved by making the additional, yet unverifiable, assumption that the variance of the stochastic gradients is uniformly bounded. In this paper, we show that there is no need for such an assumption, and that SGD enjoys a last-iterate complexity rate of approximately T^{-1/2} for convex smooth stochastic problems.
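To make the distinction between the ergodic (averaged) iterate and the last iterate concrete, here is a minimal sketch that runs SGD on a synthetic least-squares problem and reports the function value gap at both points. Everything in it is an illustrative assumption rather than the paper's setup: the function names, the synthetic data, and the constant step size 1 / (L_max * sqrt(T)) are all hypothetical choices. Least squares is used because it is convex and smooth, while the variance of its stochastic gradients grows with the distance to the solution and so is not uniformly bounded.

```python
import numpy as np


def sgd_least_squares(A, b, steps, step_size, seed=0):
    """Run SGD on f(x) = (1 / 2n) * ||Ax - b||^2, sampling one row per step.

    Returns the last iterate and the running average of the iterates.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    x_avg = np.zeros(d)
    for t in range(steps):
        i = rng.integers(n)                # pick one data point uniformly
        grad = (A[i] @ x - b[i]) * A[i]    # gradient of 0.5 * (a_i . x - b_i)^2
        x = x - step_size * grad
        x_avg += (x - x_avg) / (t + 1)     # running mean of the iterates so far
    return x, x_avg


def objective(A, b, x):
    """Full objective f(x) = (1 / 2n) * ||Ax - b||^2."""
    return 0.5 * np.mean((A @ x - b) ** 2)


def compare_last_and_average(steps=5000, n=200, d=5, seed=0):
    """Return the function value gap at x = 0, at the last iterate, and at the average."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((n, d))
    b = A @ np.ones(d) + 0.1 * rng.standard_normal(n)   # noisy targets (assumed data)

    # Reference minimizer and optimal value, used only to measure the gaps.
    x_star = np.linalg.lstsq(A, b, rcond=None)[0]
    f_star = objective(A, b, x_star)

    # Assumed step size: 1 / (L_max * sqrt(T)), with L_max the largest
    # per-sample smoothness constant max_i ||a_i||^2.
    l_max = np.max(np.sum(A ** 2, axis=1))
    step_size = 1.0 / (l_max * np.sqrt(steps))

    x_last, x_avg = sgd_least_squares(A, b, steps, step_size, seed=seed + 1)
    gap_start = objective(A, b, np.zeros(d)) - f_star
    gap_last = objective(A, b, x_last) - f_star
    gap_avg = objective(A, b, x_avg) - f_star
    return gap_start, gap_last, gap_avg


gap_start, gap_last, gap_avg = compare_last_and_average()
print(f"initial gap: {gap_start:.3e}")
print(f"last-iterate gap: {gap_last:.3e}")
print(f"averaged-iterate gap: {gap_avg:.3e}")
```

Classical ergodic bounds control only the averaged-iterate gap, whereas a last-iterate result controls the gap at the point SGD actually returns. A single run like this one illustrates the two quantities but says nothing about rates; checking the T^{-1/2} behavior empirically would require averaging over many seeds and several values of T.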
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Markov Chains and Monte Carlo Methods
