Polyak Stepsize: Estimating Optimal Functional Values Without Parameters or Prior Knowledge

Farshed Abdukhakimov; Cuong Anh Pham; Samuel Horváth; Martin Takáč; Slavomír Hanzely

arXiv:2508.17288·math.OC·August 26, 2025

TL;DR

This paper introduces a parameter-free Polyak stepsize for gradient descent that estimates the optimal functional value during the run, removing the need to know that value in advance while keeping convergence competitive.

Contribution

A novel parameter-free Polyak stepsize approach that adaptively estimates the optimal value during optimization, eliminating the need for prior information.

Findings

01

Achieves competitive convergence without prior knowledge

02

Maintains fast convergence similar to traditional Polyak stepsize

03

Validated through numerical experiments showing practical effectiveness

Abstract

The Polyak stepsize for Gradient Descent is known for its fast convergence but requires prior knowledge of the optimal functional value, which is often unavailable in practice. In this paper, we propose a parameter-free approach that estimates this unknown value during the algorithm's execution, enabling a parameter-free stepsize schedule. Our method maintains two sequences of iterates: one with a higher functional value is updated using the Polyak stepsize, and the other one with a lower functional value is used as an estimate of the optimal functional value. We provide a theoretical analysis of the approach and validate its performance through numerical experiments. The results demonstrate that our method achieves competitive performance without relying on prior function-dependent information.
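
For context, the classical Polyak stepsize that the abstract refers to sets the step to (f(x_k) - f*) / ||∇f(x_k)||², where f* is the optimal functional value. The sketch below illustrates only that classical rule, on a toy quadratic where f* = 0 is known. It is not the paper's two-sequence method, whose details are not given in this listing. The function name `polyak_gd` and the test problem are illustrative assumptions.

```python
import numpy as np

def polyak_gd(f, grad, x0, f_star, steps=100):
    """Gradient descent with the classical Polyak stepsize.

    Assumes f_star (the optimal value) is known; this is exactly the
    prior knowledge the paper aims to remove.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        g = grad(x)
        g_norm_sq = float(g @ g)
        if g_norm_sq == 0.0:  # stationary point reached, nothing to update
            break
        gamma = (f(x) - f_star) / g_norm_sq  # Polyak stepsize
        x = x - gamma * g
    return x

# Hypothetical toy problem: f(x) = 0.5 * x^T A x, minimized at the origin with f* = 0.
A = np.diag([1.0, 10.0])
f = lambda x: 0.5 * x @ A @ x
grad = lambda x: A @ x

x_start = np.array([1.0, 1.0])
x_final = polyak_gd(f, grad, x_start, f_star=0.0, steps=200)
```

In this sketch the stepsize needs no tuning, but it does depend on `f_star`. According to the abstract, the proposed method replaces that input with an estimate taken from a second sequence of iterates that has the lower functional value.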
