Accelerated Gradient Descent via Long Steps

Benjamin Grimmer; Kevin Shu; Alex L. Wang

arXiv:2309.09961·math.OC·September 28, 2023


TL;DR

This paper proves the first accelerated convergence rate for gradient descent in smooth convex optimization by using a nonconstant sequence of increasing step sizes, surpassing the traditional $O(1/T)$ rate.

Contribution

It establishes a new $O(1/T^{1.0564})$ convergence rate for gradient descent with long, nonperiodic steps, advancing the understanding of acceleration in convex optimization.

Findings

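01. Proves an $O(1/T^{1.0564})$ convergence rate for gradient descent in smooth convex minimization.

02. Shows that a nonconstant, nonperiodic sequence of increasingly large step sizes can accelerate gradient descent.

03. Shows that the theory also yields accelerated rates for strongly convex optimization, similar to but somewhat weaker than the concurrent results of Altschuler and Parrilo.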

Abstract

Recently, Grimmer [1] showed that, for smooth convex optimization, utilizing longer steps periodically improves gradient descent's textbook $LD^{2}/(2T)$ convergence guarantee by constant factors, and conjectured that an accelerated rate strictly faster than $O(1/T)$ could be possible. Here we prove such a big-O gain, establishing gradient descent's first accelerated convergence rate in this setting. Namely, we prove an $O(1/T^{1.0564})$ rate for smooth convex minimization by utilizing a nonconstant nonperiodic sequence of increasingly large stepsizes. It remains open whether one can achieve the $O(1/T^{1.178})$ rate conjectured by Das Gupta et al. [2] or the optimal gradient method rate of $O(1/T^{2})$. Big-O convergence rate accelerations from long steps follow from our theory for strongly convex optimization, similar to but somewhat weaker than those concurrently developed by Altschuler and Parrilo…
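To make the idea of "long steps" concrete, here is a minimal sketch of gradient descent with a user-supplied stepsize schedule, written as $x_{k+1} = x_k - (h_k/L)\nabla f(x_k)$ with normalized stepsizes $h_k$. Everything in it is an assumption made for illustration: the toy quadratic, the function names, and in particular the periodic pattern `[1.5, 2.9, 1.5]`, which is a hypothetical schedule containing one step longer than the classical limit of 2. It is not the nonperiodic sequence constructed in the paper.

```python
import numpy as np

def gradient_descent(grad, x0, L, steps):
    """Run x_{k+1} = x_k - (h_k / L) * grad(x_k) for each normalized stepsize h_k."""
    x = np.asarray(x0, dtype=float)
    for h in steps:
        x = x - (h / L) * grad(x)
    return x

# Toy smooth convex problem (an assumption for illustration):
# f(x) = 0.5 * sum(eigs * x**2), so L = max(eigs) and the minimum value is 0.
eigs = np.linspace(0.01, 1.0, 50)
L = eigs.max()

def f(x):
    return 0.5 * np.sum(eigs * x**2)

def grad(x):
    return eigs * x

x0 = np.ones_like(eigs)
T = 300

# Classical constant schedule h_k = 1, i.e. stepsize 1/L.
constant = [1.0] * T
# Hypothetical periodic pattern with one long step (h > 2); NOT the paper's sequence.
long_steps = [1.5, 2.9, 1.5] * (T // 3)

f_const = f(gradient_descent(grad, x0, L, constant))
f_long = f(gradient_descent(grad, x0, L, long_steps))
print(f"constant steps: {f_const:.3e}   long-step pattern: {f_long:.3e}")
```

On this particular quadratic the long-step pattern ends with a smaller objective value than the constant schedule after the same number of gradient evaluations. That is only an illustration on one toy instance; the guarantees discussed in the abstract concern worst-case rates over all smooth convex functions.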


Code & Models

Repositories

ootks/gdlongsteps (Official)


Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Advanced Optimization Algorithms Research