A Strengthened Conjecture on the Minimax Optimal Constant Stepsize for Gradient Descent

Benjamin Grimmer, Kevin Shu, and Alex L. Wang

arXiv:2407.11739·math.OC·July 17, 2024

TL;DR

This paper strengthens a conjecture about the optimal stepsize for gradient descent, proposing a specific low-rank certificate that enables verification of the conjecture for much larger iteration counts.

Contribution

The paper introduces a low-rank certificate structure that bypasses SDP solves, allowing numerical verification of the conjecture up to $N = 20160$ iterations.

Findings

1. Verification of the conjecture up to $N = 20160$ iterations
2. Proposal of a low-rank certificate structure
3. Enhanced understanding of optimal stepsize for gradient descent

Abstract

Drori and Teboulle [4] conjectured that the minimax optimal constant stepsize for $N$ steps of gradient descent is given by the stepsize that balances performance on Huber and quadratic objective functions. This was numerically supported by semidefinite program (SDP) solves of the associated performance estimation problems up to $N \approx 100$. This note presents a strengthened version of the initial conjecture. Specifically, we conjecture the existence of a certificate for the convergence rate with a very specific low-rank structure. This structure allows us to bypass SDPs and to numerically verify both conjectures up to $N = 20160$.
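
To make the "balancing" in the abstract concrete, here is a minimal sketch of how such a stepsize could be computed. It assumes the form of the two rates commonly quoted in the performance estimation literature, which this page does not state: with the stepsize $h$ normalized by the smoothness constant, the Huber-type rate is taken to be $1/(2Nh+1)$ and the quadratic rate $(1-h)^{2N}$. The function names are hypothetical, and this is not the authors' certificate-based verification procedure.

```python
def huber_rate(h: float, n: int) -> float:
    # Assumed worst-case factor on a Huber-type objective (not given in the source).
    return 1.0 / (2.0 * n * h + 1.0)


def quadratic_rate(h: float, n: int) -> float:
    # Assumed worst-case factor on a quadratic objective (not given in the source).
    return (1.0 - h) ** (2 * n)


def balanced_stepsize(n: int, tol: float = 1e-14) -> float:
    """Bisection for the h in (1, 2) where the two assumed rates coincide.

    On [1, 2] the quadratic rate increases from 0 to 1 while the Huber rate
    decreases, so their difference changes sign exactly once.
    """
    lo, hi = 1.0, 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if quadratic_rate(mid, n) < huber_rate(mid, n):
            lo = mid  # quadratic rate still smaller: the balance point is to the right
        else:
            hi = mid  # quadratic rate already dominates: the balance point is to the left
    return 0.5 * (lo + hi)
```

Under these assumed rates, calling `balanced_stepsize(n)` for increasing `n` gives values that move from 1.5 at `n = 1` toward 2, and evaluating either rate at the returned stepsize gives the corresponding conjectured worst-case factor.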

Taxonomy

Topics: Point processes and geometric inequalities