Polyak Stepsize: Estimating Optimal Functional Values Without Parameters or Prior Knowledge
Farshed Abdukhakimov, Cuong Anh Pham, Samuel Horváth, Martin Takáč, Slavomír Hanzely

TL;DR
This paper introduces a parameter-free Polyak stepsize method for gradient descent that estimates the optimal functional value on the fly, removing the need to know that value in advance while maintaining fast convergence.
Contribution
A novel parameter-free Polyak stepsize approach that adaptively estimates the optimal value during optimization, eliminating the need for prior information.
Findings
Achieves competitive performance without prior knowledge of the optimal functional value
Maintains fast convergence comparable to the classical Polyak stepsize
Supported by a theoretical analysis and validated through numerical experiments
Abstract
The Polyak stepsize for Gradient Descent is known for its fast convergence but requires prior knowledge of the optimal functional value, which is often unavailable in practice. In this paper, we propose a parameter-free approach that estimates this unknown value during the algorithm's execution, enabling a parameter-free stepsize schedule. Our method maintains two sequences of iterates: one with a higher functional value is updated using the Polyak stepsize, and the other one with a lower functional value is used as an estimate of the optimal functional value. We provide a theoretical analysis of the approach and validate its performance through numerical experiments. The results demonstrate that our method achieves competitive performance without relying on prior function-dependent information.
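To make the role of the optimal functional value concrete, here is a minimal Python sketch of the classical Polyak stepsize, (f(x) - f*) / ||∇f(x)||², with the target value passed as a parameter. It first uses the true optimum and then substitutes the functional value of a second, lower point, which is the kind of plug-in estimate the abstract describes. The function name `polyak_step`, the quadratic test problem, and the fixed second point `y` are all assumptions made for illustration. The abstract does not say how the paper updates its lower sequence, so that part is not reproduced here.

```python
import numpy as np

def polyak_step(x, f, grad, f_target):
    """One gradient step with stepsize (f(x) - f_target) / ||grad f(x)||^2."""
    g = grad(x)
    g_sq = float(np.dot(g, g))
    gap = f(x) - f_target
    if g_sq == 0.0 or gap <= 0.0:
        # Stationary point, or the target is not below the current value:
        # leave the iterate unchanged.
        return x
    return x - (gap / g_sq) * g

# Hypothetical test problem: f(x) = 0.5 * sum(a_i * x_i^2), minimum value 0 at x = 0.
a = np.array([1.0, 10.0])

def f(x):
    return 0.5 * float(np.dot(a, x * x))

def grad(x):
    return a * x

x0 = np.array([1.0, 1.0])

# (1) Classical Polyak stepsize: requires the true optimal value f* = 0.
x = x0.copy()
for _ in range(200):
    x = polyak_step(x, f, grad, f_target=0.0)
f_final = f(x)

# (2) Plug-in estimate: use the value at a second point y with f(y) < f(x0)
# in place of f*. The point y is fixed by hand here purely for illustration.
y = np.array([0.5, 0.1])
x_est = polyak_step(x0, f, grad, f_target=f(y))
```

For a convex objective, an estimate that is at least f* gives a shorter step than the true Polyak stepsize, and the step cannot increase the distance to the minimizer. By convexity, the new functional value also cannot fall below the estimate, so a scheme of this kind has to keep improving the estimate for the iterates to reach the optimum.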
