Stochastic Polyak Step-sizes and Momentum: Convergence Guarantees and Practical Performance

Dimitris Oikonomou; Nicolas Loizou

arXiv:2406.04142·math.OC·March 5, 2025

TL;DR

This paper introduces new adaptive Polyak step-size variants for stochastic heavy ball methods, providing convergence guarantees and demonstrating improved practical performance in large-scale stochastic optimization tasks.

Contribution

The paper proposes three new Polyak-type step-size rules for the Stochastic Heavy Ball method (SHB). The rules come with convergence guarantees and adapt without prior knowledge of problem parameters or an interpolation assumption.

Findings

01 Convergence to a neighborhood of the solution for convex problems.

02 Fast convergence to the true solution under interpolation.

03 Experimental validation showing robustness and effectiveness.

Abstract

Stochastic gradient descent with momentum, also known as Stochastic Heavy Ball method (SHB), is one of the most popular algorithms for solving large-scale stochastic optimization problems in various machine learning tasks. In practical scenarios, tuning the step-size and momentum parameters of the method is a prohibitively expensive and time-consuming process. In this work, inspired by the recent advantages of stochastic Polyak step-size in the performance of stochastic gradient descent (SGD), we propose and explore new Polyak-type variants suitable for the update rule of the SHB method. In particular, using the Iterate Moving Average (IMA) viewpoint of SHB, we propose and analyze three novel step-size selections: MomSPS$_{\max}$, MomDecSPS, and MomAdaSPS. For MomSPS$_{\max}$, we provide convergence guarantees for SHB to a neighborhood of the solution for convex and smooth problems…
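
To illustrate how a Polyak-type step-size can drive the SHB update, here is a minimal sketch. It is not the authors' implementation (their code lives in the repository listed on this page). It runs the heavy-ball update $x^{t+1} = x^t - \gamma_t \nabla f_i(x^t) + \beta (x^t - x^{t-1})$ on a synthetic, consistent least-squares problem, where each per-sample optimum is $f_i^* = 0$, so interpolation holds. The step-size is the capped stochastic Polyak step $\min\{(f_i(x^t) - f_i^*) / (c \|\nabla f_i(x^t)\|^2), \gamma_b\}$ scaled by $(1 - \beta)$. That scaling, the function name `shb_polyak`, and the values of `beta`, `c`, and `gamma_b` are assumptions of this sketch and are not taken from the text above.

```python
import numpy as np


def shb_polyak(A, b, beta=0.5, c=1.0, gamma_b=10.0, steps=2000, seed=0):
    """Heavy-ball SGD on f_i(x) = 0.5 * (a_i @ x - b_i)**2 with a capped Polyak step.

    Assumes f_i^* = 0 for every sample (consistent linear system).
    The (1 - beta) scaling of the step is an assumption of this sketch.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x_prev = np.zeros(d)
    x = np.zeros(d)
    for _ in range(steps):
        i = rng.integers(n)              # sample one data point uniformly
        r = A[i] @ x - b[i]              # residual of the sampled equation
        loss_i = 0.5 * r**2              # f_i(x) - f_i^*, with f_i^* = 0
        grad = r * A[i]                  # gradient of f_i at x
        g2 = grad @ grad
        if g2 < 1e-30:
            continue                     # sample already fit; skip this iteration
        # Capped Polyak step, scaled by (1 - beta) (sketch assumption).
        gamma = (1.0 - beta) * min(loss_i / (c * g2), gamma_b)
        # Heavy-ball update: gradient step plus momentum term.
        x_next = x - gamma * grad + beta * (x - x_prev)
        x_prev, x = x, x_next
    return x


def mean_loss(A, b, x):
    """Average of the per-sample losses 0.5 * (a_i @ x - b_i)**2."""
    return 0.5 * np.mean((A @ x - b) ** 2)


# Synthetic consistent system: b = A @ x_star, so every f_i can reach zero.
data_rng = np.random.default_rng(1)
A = data_rng.normal(size=(50, 5))
x_star = data_rng.normal(size=5)
b = A @ x_star

x_hat = shb_polyak(A, b)
initial_loss = mean_loss(A, b, np.zeros(5))
final_loss = mean_loss(A, b, x_hat)
print(f"initial loss: {initial_loss:.3e}, final loss: {final_loss:.3e}")
```

No learning rate is hand-tuned here: the step is computed from the sampled loss and gradient at each iteration, which is the appeal of Polyak-type rules. The sketch relies on knowing $f_i^*$, which is trivially zero in this toy problem.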

Code & Models

Repositories

dimitris-oik/momsps · PyTorch · Official

Videos

Stochastic Polyak Step-sizes and Momentum: Convergence Guarantees and Practical Performance · SlidesLive

Taxonomy

Topics: Stochastic processes and statistical mechanics · Markov Chains and Monte Carlo Methods · Random Matrices and Applications

Methods: Stochastic Gradient Descent