Exact worst-case convergence rates of gradient descent: a complete analysis for all constant stepsizes over nonconvex and convex functions

Teodor Rotaru; François Glineur; Panagiotis Patrinos

arXiv:2406.17506 · math.OC · January 23, 2026

Open Access

TL;DR

This paper derives exact worst-case convergence rates of gradient descent for all constant stepsizes over convex, strongly convex, and weakly convex (hypoconvex) functions, identifies the optimal constant stepsize, and proposes a variable-stepsize variant.

Contribution

It offers the first complete, exact worst-case convergence analysis of gradient descent for any constant stepsize under arbitrary upper and lower bounds on the curvature of the objective, and introduces a variable-stepsize variant with improved worst-case performance.

Findings

01

Exact worst-case convergence rates derived for all constant stepsizes.

02

Identification of the optimal constant stepsize for gradient descent.

03

Introduction of a new variable stepsize gradient descent variant with improved worst-case performance.

Abstract

We consider gradient descent with constant stepsizes and derive exact worst-case convergence rates on the minimum gradient norm of the iterates. Our analysis covers all possible stepsizes and arbitrary upper/lower bounds on the curvature of the objective function, thus including convex, strongly convex and weakly convex (hypoconvex) objective functions. Among the challenging parts of the analysis, we note the necessity to exploit dependencies between non-consecutive iterates. While this complicates the proofs to some extent, it enables us to achieve an exact full-range analysis of gradient descent for any constant stepsize (covering, in particular, normalized stepsizes greater than one), whereas the literature contained only conjectured rates of this type. In the nonconvex case, allowing arbitrary bounds on upper and lower curvatures extends existing partial results that are valid…
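The setting described in the abstract can be illustrated with a minimal sketch, which is not taken from the paper. It runs gradient descent with a constant stepsize and records the minimum gradient norm over the iterates, the performance measure the abstract names. The function name `gd_min_grad_norm`, the parameter `h` (assumed here to be the normalized stepsize, so the actual step is `h / L` with `L` the upper curvature bound), and the quadratic test problem are all hypothetical choices made for illustration.

```python
import math


def gd_min_grad_norm(grad, x0, L, h, num_steps):
    """Run x_{k+1} = x_k - (h / L) * grad(x_k) for num_steps steps.

    Returns the final iterate and min_k ||grad(x_k)|| over k = 0..num_steps.
    `h` is assumed to be the normalized stepsize (step length = h / L).
    """
    x = list(x0)
    g = grad(x)
    best = math.sqrt(sum(gi * gi for gi in g))
    for _ in range(num_steps):
        # Constant-stepsize gradient step.
        x = [xi - (h / L) * gi for xi, gi in zip(x, g)]
        g = grad(x)
        # Track the smallest gradient norm seen so far.
        best = min(best, math.sqrt(sum(gi * gi for gi in g)))
    return x, best


# Hypothetical test problem: f(x) = 0.5 * (1.0 * x[0]**2 + 0.1 * x[1]**2).
# Its curvature lies in [0.1, 1.0], so the upper curvature bound is L = 1.0.
def quad_grad(x):
    return [1.0 * x[0], 0.1 * x[1]]


# Normalized stepsize h = 1.5, i.e. greater than one.
x_final, best_norm = gd_min_grad_norm(quad_grad, [1.0, 1.0], L=1.0, h=1.5, num_steps=10)
print(x_final, best_norm)
```

On this quadratic each coordinate is multiplied by `1 - h * d_i` per step, so the run stays stable even with a normalized stepsize above one. The sketch only measures the quantity being bounded; it does not reproduce any rate from the paper.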

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Mathematical Biology Tumor Growth