Enhancing Optimizer Stability: Momentum Adaptation of The NGN Step-size
Rustem Islamov, Niccolo Ajroldi, Antonio Orvieto, Aurelien Lucchi

TL;DR
This paper introduces NGN-M, a momentum-based variant of the NGN step-size method that makes deep learning training more stable and less sensitive to the choice of step-size hyperparameter, while matching or surpassing state-of-the-art optimizers.
Contribution
It presents a novel momentum-based adaptation of the NGN step-size that achieves standard convergence rates under weaker assumptions and enhances optimizer stability.
Findings
Enhanced robustness to step-size hyperparameter choices.
Achieves the standard convergence rate of O(1/√K) under weaker assumptions.
Performs comparably to or better than existing optimizers.
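The step-size robustness listed above can be illustrated on a toy problem. The sketch below is not the paper's NGN-M algorithm: it uses the NGN step-size formula as recalled from the cited work of Orvieto and Xiao (2024), and pairs it with a plain heavy-ball update as an assumed, illustrative form of momentum. The function names, the quadratic objective f(x) = 0.5 x², and all constants are hypothetical.

```python
def ngn_step_size(loss, grad_sq_norm, sigma):
    """NGN step-size as recalled from Orvieto and Xiao (2024):
    sigma / (1 + sigma * ||grad||^2 / (2 * loss)), for a non-negative loss.
    """
    if loss <= 0.0:
        # Guard for this sketch only: a non-negative loss that equals zero
        # has zero gradient, so the step-size value does not matter.
        return 0.0
    return sigma / (1.0 + sigma * grad_sq_norm / (2.0 * loss))


def run_heavy_ball(sigma, adaptive, beta=0.9, steps=200, x0=5.0):
    """Minimize the toy objective f(x) = 0.5 * x**2 with heavy-ball momentum.

    ASSUMPTION: the (1 - beta) * gamma scaling of the gradient is one common
    heavy-ball convention, used here for illustration; it is not taken from
    the paper.
    """
    x_prev = x = x0
    for _ in range(steps):
        grad = x  # gradient of 0.5 * x**2
        if adaptive:
            loss = 0.5 * x * x
            gamma = ngn_step_size(loss, grad * grad, sigma)
        else:
            gamma = sigma  # constant step-size baseline
        x_next = x - (1.0 - beta) * gamma * grad + beta * (x - x_prev)
        x_prev, x = x, x_next
    return x


if __name__ == "__main__":
    for sigma in (0.1, 1.0, 10.0, 100.0):
        adaptive_final = run_heavy_ball(sigma, adaptive=True)
        constant_final = run_heavy_ball(sigma, adaptive=False)
        print(f"sigma={sigma:>6}: NGN |x|={abs(adaptive_final):.3e}, "
              f"constant |x|={abs(constant_final):.3e}")
```

On this particular quadratic the NGN step-size reduces to sigma / (1 + sigma), which stays below 1 for every sigma, so the adaptive run remains stable across three orders of magnitude of sigma, whereas the constant step-size baseline diverges once sigma is large. This is a property of the toy setup and says nothing about the paper's experiments.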
Abstract
Modern optimization algorithms that incorporate momentum and adaptive step-size offer improved performance in numerous challenging deep learning tasks. However, their effectiveness is often highly sensitive to the choice of hyperparameters, especially the step-size. Tuning these parameters is often difficult, resource-intensive, and time-consuming. Therefore, recent efforts have been directed toward enhancing the stability of optimizers across a wide range of hyperparameter choices [Schaipp et al., 2024]. In this paper, we introduce an algorithm that matches the performance of state-of-the-art optimizers while improving stability to the choice of the step-size hyperparameter through a novel adaptation of the NGN step-size method [Orvieto and Xiao, 2024]. Specifically, we propose a momentum-based version (NGN-M) that attains the standard convergence rate of …
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Machine Learning and Data Classification · Advanced Bandit Algorithms Research
