A new regret analysis for Adam-type algorithms
Ahmet Alacaoglu, Yura Malitsky, Panayotis Mertikopoulos, Volkan Cevher

TL;DR
This paper introduces a new theoretical framework that bridges the gap between practical usage and theoretical guarantees of Adam-type algorithms by enabling optimal regret bounds with constant momentum parameters.
Contribution
The authors develop a novel analysis method that provides regret guarantees for Adam variants with fixed momentum parameters, aligning theory with practical implementations.
Findings
Achieves data-dependent regret bounds with constant $\beta_1$
Applies to a wide range of Adam-type algorithms
Removes the need for decaying $\beta_1$ in theoretical analysis
Abstract
In this paper, we focus on a theory-practice gap for Adam and its variants (AMSgrad, AdamNC, etc.). In practice, these algorithms are used with a constant first-order moment parameter $\beta_1$ (typically between $0.9$ and $0.99$). In theory, regret guarantees for online convex optimization require a rapidly decaying $\beta_1 \to 0$ schedule. We show that this is an artifact of the standard analysis and propose a novel framework that allows us to derive optimal, data-dependent regret bounds with a constant $\beta_1$, without further assumptions. We also demonstrate the flexibility of our analysis on a wide range of different algorithms and settings.
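To make the setting concrete, here is a minimal sketch of projected AMSGrad with a constant $\beta_1$ on a toy online convex problem, measuring regret against the best fixed decision in hindsight. It is not taken from the paper and does not reproduce its analysis. The quadratic losses, the periodic target sequence, the box domain $[-1, 1]$, the step size $\alpha/\sqrt{t}$, and the hyperparameter values are all illustrative assumptions, as is the function name `amsgrad_regret`.

```python
import math


def amsgrad_regret(targets, alpha=0.1, beta1=0.9, beta2=0.999, lo=-1.0, hi=1.0):
    """Run projected AMSGrad on f_t(x) = 0.5 * (x - z_t)^2 over [lo, hi].

    Returns (regret, final iterate). beta1 is held constant for all rounds.
    The loss family and all default values are illustrative assumptions.
    """
    x, m, v, v_hat = 0.0, 0.0, 0.0, 0.0
    learner_loss = 0.0
    for t, z in enumerate(targets, start=1):
        # The learner commits to x, then the loss f_t is revealed.
        learner_loss += 0.5 * (x - z) ** 2
        g = x - z  # gradient of f_t at x

        # First and second moment estimates (no bias correction).
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        # AMSGrad keeps the running maximum of the second moment.
        v_hat = max(v_hat, v)

        if v_hat > 0.0:  # guard against a zero first gradient
            x -= (alpha / math.sqrt(t)) * m / math.sqrt(v_hat)
        # In one dimension, projecting onto an interval is clipping.
        x = min(hi, max(lo, x))

    # Best fixed decision in hindsight for quadratic losses: the clipped mean.
    x_star = min(hi, max(lo, sum(targets) / len(targets)))
    best_loss = sum(0.5 * (x_star - z) ** 2 for z in targets)
    return learner_loss - best_loss, x


T = 3000
# Hypothetical target sequence: two rounds at 0.5, then one at -0.5, repeated.
targets = [-0.5 if t % 3 == 0 else 0.5 for t in range(1, T + 1)]
regret, x_final = amsgrad_regret(targets)
print(f"regret after {T} rounds: {regret:.3f} (average {regret / T:.5f})")
```

Rerunning with a different constant `beta1`, for example 0.99, gives a rough feel for how the average regret behaves on this toy instance. Such an experiment is only suggestive and is no substitute for the regret bounds the abstract refers to.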
Taxonomy
Topics: Advanced Bandit Algorithms Research · Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques
Methods: Adam
