Revisiting Stochastic Gradient Descent for Strongly Convex Objectives: Tight Uniform-in-Time Bounds

Kang Chen; Yasong Feng; Tianyu Wang

arXiv:2508.20823·math.OC·March 19, 2026

TL;DR

This paper establishes tight, uniform-in-time convergence bounds for stochastic gradient descent on strongly convex functions, improving understanding of its long-term behavior and extending results to related classes like Polyak-Łojasiewicz functions.

Contribution

It provides the first tight, uniform-in-time convergence bounds for SGD on strongly convex objectives, including an improved last-iterate rate and generalizations to broader function classes.

Findings

01

Convergence rate of order (log log k + log(1/β))/k, holding simultaneously for all k ≥ 1 with probability at least 1 − β

02

Bound is tight up to constant factors

03

Extension to Polyak-Łojasiewicz functions and contractive stochastic approximation
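
Finding 01 can be written out to make the order of quantifiers explicit. This is a sketch only: the symbols $e_k$ (the error after $k$ steps) and $C$ (a constant depending on the problem parameters) are notation assumed here, and the abstract does not specify which error measure the paper uses.

```latex
\Pr\!\left[\;\forall k \in \mathbb{N}_{+}:\quad
  e_k \;\le\; C\,\frac{\log\log k + \log(1/\beta)}{k}\;\right] \;\ge\; 1-\beta .
```

The "for all $k$" sits inside the probability, so a single event of probability at least $1-\beta$ controls the entire trajectory. That is stronger than a bound holding with probability $1-\beta$ at each fixed $k$ separately.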

Abstract

Stochastic optimization via Stochastic Gradient Descent (SGD) is a fundamental problem in statistics and optimization. This paper revisits SGD for strongly convex objectives, establishing tight, uniform-in-time convergence bounds. We prove that, with probability at least $1 - \beta$, a convergence rate of order $\frac{\log\log k + \log(1/\beta)}{k}$ holds simultaneously for all $k \in \mathbb{N}_{+}$, and demonstrate that this bound is tight up to constant factors. We also provide an improved last-iterate convergence rate for such objectives. While focused on strongly convex objectives, our results generalize to Polyak-Łojasiewicz functions and indicate an $O(k^{-1} \log\log k)$ convergence rate for contractive stochastic approximation with additive noise.
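
The uniform-in-time rate can be explored numerically on a toy problem. The following is a minimal sketch, not the paper's setup: it assumes a one-dimensional quadratic $f(x) = \frac{\mu}{2}x^2$, additive Gaussian gradient noise, and the step size $1/(\mu k)$, all of which are illustrative choices. The function names `sgd_quadratic` and `worst_scaled_error` are hypothetical. The script tracks the squared distance to the minimizer and reports the largest value of that error divided by $(\log\log k + \log(1/\beta))/k$ along one trajectory.

```python
import math
import random


def sgd_quadratic(num_steps, mu=1.0, sigma=1.0, x0=5.0, seed=0):
    """Run SGD on f(x) = (mu/2) * x**2 with noisy gradients.

    Assumed setup (not taken from the paper): gradient oracle mu*x + noise,
    noise ~ N(0, sigma^2), step size 1/(mu*k) at step k.
    Returns the squared distance to the minimizer x* = 0 after each step.
    """
    rng = random.Random(seed)
    x = x0
    errors = []
    for k in range(1, num_steps + 1):
        noisy_grad = mu * x + rng.gauss(0.0, sigma)
        x -= noisy_grad / (mu * k)
        errors.append(x * x)
    return errors


def worst_scaled_error(errors, beta=0.05, k_min=3):
    """Largest ratio err_k / ((log log k + log(1/beta)) / k) over k >= k_min.

    k_min >= 3 keeps log log k well defined and positive.
    """
    worst = 0.0
    for k, err in enumerate(errors, start=1):
        if k < k_min:
            continue
        rate = (math.log(math.log(k)) + math.log(1.0 / beta)) / k
        worst = max(worst, err / rate)
    return worst


if __name__ == "__main__":
    errs = sgd_quadratic(num_steps=10_000)
    print("final squared error:", errs[-1])
    print("worst scaled error along the path:", worst_scaled_error(errs))
```

If the scaled error stays bounded along the whole path as `num_steps` grows, that is consistent with a rate of the stated order on this toy problem. A single simulated trajectory is only an illustration and does not verify the high-probability statement.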
