New Perspectives on the Polyak Stepsize: Surrogate Functions and Negative Results

Francesco Orabona; Ryan D'Orazio

arXiv:2505.20219 · math.OC · January 22, 2026

TL;DR

This paper unifies the Polyak stepsize and its variants by viewing each as gradient descent on a surrogate loss, which clarifies their convergence properties and limitations.

Contribution

The paper introduces a simple, unified framework that analyzes Polyak stepsize variants as surrogate loss minimization, revealing their convergence behavior and limitations.

Findings

01. Unified analysis of Polyak variants across assumptions
02. Negative results confirming non-convergence in some cases
03. Insight into local curvature adaptation of stepsizes

Abstract

The Polyak stepsize has been proven to be a fundamental stepsize in convex optimization, giving near-optimal gradient descent rates across a wide range of assumptions. The universality of the Polyak stepsize has also inspired many stochastic variants, with theoretical guarantees and strong empirical performance. Despite the many theoretical results, our understanding of the convergence properties and shortcomings of the Polyak stepsize and its variants is both incomplete and fractured across different analyses. We propose a new, unified, and simple perspective for the Polyak stepsize and its variants as gradient descent on a surrogate loss. We show that each variant is equivalent to minimizing a surrogate function with stepsizes that adapt to a guaranteed local curvature. Our general surrogate loss perspective is then used to provide a unified analysis of existing variants across different…
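The surrogate view described in the abstract can be illustrated with a minimal sketch. It assumes the classical Polyak update x - eta * g with eta = (f(x) - f*) / ||g||^2, and takes phi(x) = 0.5 * (f(x) - f*)^2 as the surrogate, so that a gradient step on phi with stepsize 1 / ||g||^2 reproduces the Polyak update exactly. The surrogate choice, the toy quadratic objective, and all function names below are assumptions made for illustration; the paper's exact construction and its treatment of the variants may differ.

```python
import math

F_STAR = 0.0  # known optimal value of the toy objective (an assumption of this sketch)

def f(x):
    # Toy convex quadratic, minimized at the origin with f* = 0.
    return 0.5 * (x[0] ** 2 + 10.0 * x[1] ** 2)

def grad_f(x):
    # Gradient of the toy quadratic.
    return [x[0], 10.0 * x[1]]

def sq_norm(v):
    return sum(vi * vi for vi in v)

def polyak_step(x):
    # Classical Polyak update: x - eta * g with eta = (f(x) - f*) / ||g||^2.
    g = grad_f(x)
    gg = sq_norm(g)
    if gg == 0.0:  # zero gradient: already at a minimizer
        return list(x)
    eta = (f(x) - F_STAR) / gg
    return [xi - eta * gi for xi, gi in zip(x, g)]

def surrogate_gd_step(x):
    # Gradient descent on the assumed surrogate phi(x) = 0.5 * (f(x) - f*)^2,
    # whose gradient is (f(x) - f*) * grad f(x), with stepsize 1 / ||g||^2.
    g = grad_f(x)
    gg = sq_norm(g)
    if gg == 0.0:
        return list(x)
    grad_phi = [(f(x) - F_STAR) * gi for gi in g]
    return [xi - (1.0 / gg) * pi for xi, pi in zip(x, grad_phi)]

def run(x0, steps):
    # Iterate the Polyak update, recording the distance to the minimizer (the origin).
    x, dists = list(x0), [math.sqrt(sq_norm(x0))]
    for _ in range(steps):
        x = polyak_step(x)
        dists.append(math.sqrt(sq_norm(x)))
    return x, dists
```

Calling `polyak_step([3.0, -1.0])` and `surrogate_gd_step([3.0, -1.0])` should return the same point up to floating-point rounding, and `run([3.0, -1.0], 500)` tracks how the distance to the minimizer shrinks. Note that the sketch requires f* to be known, which is the standard requirement of the classical Polyak stepsize.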

Taxonomy

Topics: Matrix Theory and Algorithms · Mathematical Approximation and Integration · Probability and Risk Models