Effectively Leveraging Momentum Terms in Stochastic Line Search Frameworks for Fast Optimization of Finite-Sum Problems

Matteo Lapucci; Davide Pucci

arXiv:2411.07102 · math.OC · March 13, 2026

TL;DR

This paper introduces an optimization algorithm that combines momentum, mini-batch persistency, and stochastic line searches. It comes with convergence guarantees under suitable assumptions and is reported to outperform other popular methods on large-scale deep learning tasks.

Contribution

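It proposes an algorithmic framework that combines mini-batch persistency, conjugate-gradient type rules for the momentum parameter, and stochastic line searches, with convergence proven under suitable assumptions and empirical performance exceeding that of other popular methods.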

Findings

01. Outperforms existing methods in large-scale deep learning tasks

02. Achieves state-of-the-art results in convex and nonconvex problems

03. Provides convergence guarantees under certain assumptions

Abstract

In this work, we address unconstrained finite-sum optimization problems, with particular focus on instances originating in large scale deep learning scenarios. Our main interest lies in the exploration of the relationship between recent line search approaches for stochastic optimization in the overparametrized regime and momentum directions. First, we point out that combining these two elements with computational benefits is not straightforward. To this aim, we propose a solution based on mini-batch persistency. We then introduce an algorithmic framework that exploits a mix of data persistency, conjugate-gradient type rules for the definition of the momentum parameter and stochastic line searches. The resulting algorithm provably possesses convergence properties under suitable assumptions and is empirically shown to outperform other popular methods from the literature, obtaining…
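As a rough illustration of how the three ingredients named in the abstract could fit together, the Python sketch below runs a momentum method with a stochastic Armijo line search on a toy interpolating least-squares problem. It is not the authors' algorithm. The overlap-based form of mini-batch persistency, the Fletcher-Reeves formula for the momentum parameter, the restart safeguard, the backtracking constants, and every function name are assumptions chosen for illustration only.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def batch_loss_grad(w, A, b, idx):
    """Mean least-squares loss and its gradient over the samples in idx."""
    loss, grad = 0.0, [0.0] * len(w)
    for i in idx:
        r = dot(A[i], w) - b[i]
        loss += 0.5 * r * r
        for j, a in enumerate(A[i]):
            grad[j] += r * a
    m = len(idx)
    return loss / m, [g / m for g in grad]


def armijo(w, d, A, b, idx, loss, slope, alpha=1.0, c=1e-4, shrink=0.5):
    """Backtrack on the current mini-batch until sufficient decrease holds."""
    for _ in range(50):  # cap on the number of backtracks (assumed)
        trial = [wi + alpha * di for wi, di in zip(w, d)]
        if batch_loss_grad(trial, A, b, idx)[0] <= loss + c * alpha * slope:
            break
        alpha *= shrink
    return alpha


def train(A, b, batch_size=8, iters=300, seed=0):
    rng = random.Random(seed)
    n, p = len(A), len(A[0])
    w, d, prev_gg = [0.0] * p, None, 0.0
    batch = rng.sample(range(n), batch_size)
    for _ in range(iters):
        # Persistency (assumed form): half of the previous mini-batch is
        # carried over, so consecutive stochastic gradients share samples.
        keep = batch[: batch_size // 2]
        pool = [i for i in range(n) if i not in keep]
        batch = keep + rng.sample(pool, batch_size - len(keep))
        rng.shuffle(batch)

        loss, g = batch_loss_grad(w, A, b, batch)
        gg = dot(g, g)

        # Conjugate-gradient type momentum parameter (Fletcher-Reeves, assumed).
        beta = gg / prev_gg if d is not None and prev_gg > 0.0 else 0.0
        if beta > 0.0:
            new_d = [-gi + beta * di for gi, di in zip(g, d)]
        else:
            new_d = [-gi for gi in g]

        # Safeguard (assumed): restart from the negative gradient whenever
        # the momentum direction is not a descent direction on this batch.
        slope = dot(g, new_d)
        if slope >= 0.0:
            new_d, slope = [-gi for gi in g], -gg

        alpha = armijo(w, new_d, A, b, batch, loss, slope)
        w = [wi + alpha * di for wi, di in zip(w, new_d)]
        d, prev_gg = new_d, gg
    return w
```

Calling `train(A, b)` with `A` a list of feature rows and `b` a list of targets returns the final weight vector. The line search is evaluated on the same mini-batch that produced the gradient, so each accepted step decreases that mini-batch's loss; the overlap between consecutive batches is what makes the previous direction and gradient norm informative for the current step in this sketch.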

Code & Models

Repositories

dadopuccio/mb-conjugate-gradient-dp (PyTorch, official)
