Incorporating Preconditioning into Accelerated Approaches: Theoretical Guarantees and Practical Improvement

Stepan Trifonov; Leonid Levin; Savelii Chezhegov; Aleksandr Beznosikov

arXiv:2505.23510·math.OC·October 1, 2025

TL;DR

This paper introduces preconditioned accelerated optimization methods, gives theoretical convergence guarantees for them, and reports improved practical performance over unscaled methods on poorly conditioned problems.

Contribution

The paper proposes the Preconditioned Heavy Ball (PHB) and Preconditioned Nesterov (PN) methods with unified convergence guarantees, combining preconditioning with momentum acceleration.

Findings

01 Proposed methods have superior convergence properties.

02 Numerical experiments show improved iteration efficiency.

03 Methods outperform unscaled techniques in practice.

Abstract

Machine learning and deep learning are widely researched fields that provide solutions to many modern problems. Due to the complexity of new problems related to the size of datasets, efficient approaches are obligatory. In optimization theory, the Heavy Ball and Nesterov methods use momentum in their updates of model weights. On the other hand, the minimization problems considered may be poorly conditioned, which affects the applicability and effectiveness of the aforementioned techniques. One solution to this issue is preconditioning, which has already been investigated in approaches such as AdaGrad, RMSProp, Adam and others. Despite this, momentum acceleration and preconditioning have not been fully explored together. Therefore, we propose the Preconditioned Heavy Ball (PHB) and Preconditioned Nesterov method (PN) with…
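To make the interplay of momentum and preconditioning concrete, here is a minimal sketch in Python of a Heavy Ball iteration with an optional diagonal scaling of the gradient. It is not the paper's PHB algorithm, whose exact update rule is not given in this summary. The function name heavy_ball, the fixed preconditioner diag_precond, the toy quadratic, and the step sizes are all assumptions chosen for illustration.

```python
import numpy as np


def heavy_ball(grad, x0, lr, beta, diag_precond=None, steps=200):
    """Run x_{k+1} = x_k - lr * D^{-1} grad(x_k) + beta * (x_k - x_{k-1}).

    diag_precond holds the diagonal of D (a hypothetical fixed preconditioner);
    None means D = I, i.e. the plain Heavy Ball method.
    """
    x_prev = np.asarray(x0, dtype=float)
    x = x_prev.copy()
    d = np.ones_like(x) if diag_precond is None else np.asarray(diag_precond, dtype=float)
    for _ in range(steps):
        # Scaled gradient step plus a momentum term built from the last displacement.
        x_next = x - lr * grad(x) / d + beta * (x - x_prev)
        x_prev, x = x, x_next
    return x


# Toy poorly conditioned quadratic f(x) = 0.5 * sum(a_i * x_i^2), condition number 1000.
a = np.array([1.0, 1000.0])


def grad_f(x):
    return a * x


x0 = np.array([1.0, 1.0])

# Step size 1/L in both runs, where L is the largest curvature each method sees
# (1000 without scaling, 1 after dividing the gradient by the diagonal a).
plain = heavy_ball(grad_f, x0, lr=1e-3, beta=0.5)
scaled = heavy_ball(grad_f, x0, lr=1.0, beta=0.5, diag_precond=a)

err_plain = np.linalg.norm(plain)    # distance to the minimizer x* = 0
err_scaled = np.linalg.norm(scaled)
print(f"plain: {err_plain:.3e}, preconditioned: {err_scaled:.3e}")
```

In this toy problem the preconditioner equals the Hessian diagonal, so the scaled problem is perfectly conditioned and the slow coordinate no longer limits progress. Adaptive schemes such as AdaGrad, RMSProp, or Adam build their scaling from observed gradients and would only approximate this ideal case.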
