A new regret analysis for Adam-type algorithms

Ahmet Alacaoglu; Yura Malitsky; Panayotis Mertikopoulos; Volkan Cevher

arXiv:2003.09729 · stat.ML · March 24, 2020 · 5 cites

TL;DR

This paper closes a theory-practice gap for Adam-type algorithms: its new analysis framework yields optimal, data-dependent regret bounds for online convex optimization with a constant first-order moment parameter $β_{1}$, as used in practice.

Contribution

The authors develop a novel analysis framework that gives regret guarantees for Adam variants (AMSgrad, AdamNC, etc.) with a constant $β_{1}$ and no further assumptions, so that the theory matches practical implementations.

Findings

01. Achieves optimal, data-dependent regret bounds with a constant $β_{1}$

02. Applies to a wide range of Adam-type algorithms and settings

03. Removes the need for a decaying $β_{1}$ schedule in the theoretical analysis
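For readers who want the quantities behind these findings spelled out, the standard definitions can be written as follows. This is a sketch using conventional online convex optimization notation; the symbols $f_t$, $x_t$, $g_t$, $m_t$ and $\mathcal{X}$ are assumptions of this note and do not appear on this page.

```latex
% Regret after T rounds, measured against the best fixed decision in the feasible set X
R(T) = \sum_{t=1}^{T} f_t(x_t) - \min_{x \in \mathcal{X}} \sum_{t=1}^{T} f_t(x)

% First-moment (momentum) recursion with a constant parameter, where g_t is a (sub)gradient of f_t at x_t
m_t = \beta_1 m_{t-1} + (1 - \beta_1)\, g_t, \qquad \beta_1 \in [0, 1) \text{ fixed for all } t
```

In this notation, "constant $β_{1}$" means the same value is used in every round, as opposed to a schedule in which the parameter shrinks to zero as $t$ grows.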

Abstract

In this paper, we focus on a theory-practice gap for Adam and its variants (AMSgrad, AdamNC, etc.). In practice, these algorithms are used with a constant first-order moment parameter $β_{1}$ (typically between $0.9$ and $0.99$). In theory, regret guarantees for online convex optimization require a rapidly decaying $β_{1} \to 0$ schedule. We show that this is an artifact of the standard analysis and propose a novel framework that allows us to derive optimal, data-dependent regret bounds with a constant $β_{1}$, without further assumptions. We also demonstrate the flexibility of our analysis on a wide range of different algorithms and settings.
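To make the setting concrete, here is a minimal sketch of an AMSgrad-style update run with a constant $β_{1}$ on a toy online convex problem. It is an illustration only, not the authors' code or experiments: the function names `amsgrad_online` and `demo`, the quadratic losses, the box constraint, the step size `alpha / sqrt(t)`, the default hyperparameters, and the small `eps` safeguard are all assumptions chosen for this example.

```python
import math
import random


def amsgrad_online(grad_fns, x0, lo, hi, alpha=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """Play one point per round using an AMSgrad-style update with constant beta1.

    grad_fns: list of functions, one per round, mapping a point to its gradient.
    lo, hi:   box constraint applied to every coordinate (hypothetical feasible set).
    Returns the list of points played, one per round.
    """
    d = len(x0)
    x = list(x0)
    m = [0.0] * d       # first-moment estimate
    v = [0.0] * d       # second-moment estimate
    v_hat = [0.0] * d   # running maximum of v (the AMSgrad modification)
    played = []
    for t, grad in enumerate(grad_fns, start=1):
        played.append(list(x))           # commit to x before the loss is revealed
        g = grad(x)
        step = alpha / math.sqrt(t)      # assumed step-size schedule
        for i in range(d):
            m[i] = beta1 * m[i] + (1.0 - beta1) * g[i]   # beta1 is the same every round
            v[i] = beta2 * v[i] + (1.0 - beta2) * g[i] * g[i]
            v_hat[i] = max(v_hat[i], v[i])
            x[i] -= step * m[i] / (math.sqrt(v_hat[i]) + eps)
            # For a box and a diagonal scaling, the weighted projection is a clip.
            x[i] = min(hi, max(lo, x[i]))
    return played


def demo(T=2000, seed=0):
    """Toy 1-D problem: f_t(x) = 0.5 * (x - c_t)^2 with random targets c_t in [-1, 1]."""
    rng = random.Random(seed)
    targets = [rng.uniform(-1.0, 1.0) for _ in range(T)]
    # Bind c as a default argument so each closure keeps its own target.
    grad_fns = [lambda x, c=c: [x[0] - c] for c in targets]
    played = amsgrad_online(grad_fns, x0=[-1.0], lo=-1.0, hi=1.0)

    def loss(x, c):
        return 0.5 * (x - c) ** 2

    best = sum(targets) / T  # best fixed point in hindsight; it lies inside the box
    regret = sum(loss(p[0], c) for p, c in zip(played, targets)) - sum(
        loss(best, c) for c in targets
    )
    return played, regret


if __name__ == "__main__":
    played, regret = demo()
    print("rounds:", len(played))
    print("regret:", regret)
    print("average regret:", regret / len(played))
```

Running `demo` with different values of `T` shows how the average regret changes as the horizon grows on this toy problem; it says nothing about the constants or rates proved in the paper.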

Videos

A new regret analysis for Adam-type algorithms · SlidesLive

Taxonomy

Topics: Advanced Bandit Algorithms Research · Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques

Methods: Adam