Dynamic Regret of Adaptive Gradient Methods for Strongly Convex Problems

Parvin Nazari; Esmaile Khorram

arXiv:2209.01608·cs.LG·September 7, 2022



TL;DR

This paper analyzes the dynamic regret of an adaptive gradient method called M-ADAGRAD in strongly convex problems, providing bounds that account for environment non-stationarity and demonstrating practical effectiveness.

Contribution

It presents a dynamic regret analysis of M-ADAGRAD, a variant of ADAGRAD, in the strongly convex setting, with bounds stated in terms of environment non-stationarity and the use of multiple gradient accesses.

Findings

01 Regret bounds depend on the path-length of the minimizer sequence.

02 Exploiting multiple gradient accesses improves the regret bounds.

03 Empirical results show that M-ADAGRAD performs well in practice.
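The first two findings can be made concrete with a toy experiment. The following is a minimal sketch, not the paper's M-ADAGRAD: it assumes quadratic losses f_t(x) = 0.5 * ||x - theta_t||^2 with a drifting minimizer theta_t, a diagonal ADAGRAD-style step size, and a hypothetical `inner_steps` parameter that stands in for multiple gradient accesses per round. The function name and all parameters are illustrative assumptions. It returns the dynamic regret (cumulative loss measured against the per-round minimizers) together with the path-length of the minimizer sequence.

```python
import numpy as np


def run_online_adagrad(minimizers, inner_steps=1, lr=1.0, eps=1e-8):
    """Toy online learner on f_t(x) = 0.5 * ||x - theta_t||^2.

    Illustrative assumption only: this is a plain diagonal ADAGRAD-style
    update with `inner_steps` gradient queries per round, not the
    M-ADAGRAD algorithm from the paper.

    Returns (dynamic_regret, path_length).
    """
    minimizers = np.asarray(minimizers, dtype=float)
    num_rounds, dim = minimizers.shape

    x = np.zeros(dim)            # learner's current decision
    grad_sq_sum = np.zeros(dim)  # running sum of squared gradients
    dynamic_regret = 0.0

    for t in range(num_rounds):
        theta = minimizers[t]
        # f_t(theta_t) = 0, so the loss at x is the per-round regret.
        dynamic_regret += 0.5 * np.sum((x - theta) ** 2)

        # Query the gradient of the same f_t several times before moving on.
        for _ in range(inner_steps):
            grad = x - theta
            grad_sq_sum += grad ** 2
            x = x - lr * grad / (np.sqrt(grad_sq_sum) + eps)

    # Path-length: total Euclidean movement of the minimizer sequence.
    steps = np.diff(minimizers, axis=0)
    path_length = float(np.sum(np.linalg.norm(steps, axis=1)))
    return float(dynamic_regret), path_length


if __name__ == "__main__":
    # A slowly rotating minimizer in 2D (hypothetical test environment).
    angles = np.linspace(0.0, np.pi, 50)
    thetas = np.stack([np.cos(angles), np.sin(angles)], axis=1)
    for k in (1, 5):
        regret, path = run_online_adagrad(thetas, inner_steps=k)
        print(f"inner_steps={k}: regret={regret:.4f}, path_length={path:.4f}")
```

Running the script with different `inner_steps` values and different drift speeds gives a rough feel for how the regret of this toy learner moves with the path-length; it makes no claim about matching the bounds or experiments in the paper.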

Abstract

Adaptive gradient algorithms such as ADAGRAD and its variants have gained popularity in the training of deep neural networks. While many works on adaptive methods have focused on static regret as the performance metric, the dynamic regret of these methods remains unclear. In contrast to static regret, dynamic regret is considered a stronger notion of performance because it explicitly accounts for the non-stationarity of the environment. In this paper, we analyze a variant of ADAGRAD (referred to as M-ADAGRAD) in a strongly convex setting via the notion of dynamic regret, which measures the performance of an online learner against a reference (optimal) solution that may change over time. We demonstrate a regret bound in terms of the path-length of the minimizer sequence that essentially reflects the…
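For readers unfamiliar with the terms used in the abstract, the quantities are commonly defined as follows in the online convex optimization literature. The notation here ($f_t$, $x_t$, $x_t^*$, $\mathcal{X}$, $P_T$) is an assumption chosen for illustration and may differ from the paper's own.

```latex
% Static regret: compare against the single best fixed decision in hindsight.
R_T^{\mathrm{static}} = \sum_{t=1}^{T} f_t(x_t) - \min_{x \in \mathcal{X}} \sum_{t=1}^{T} f_t(x)

% Dynamic regret: compare against the minimizer of each round's loss.
R_T^{\mathrm{dynamic}} = \sum_{t=1}^{T} f_t(x_t) - \sum_{t=1}^{T} f_t(x_t^*),
\qquad x_t^* \in \arg\min_{x \in \mathcal{X}} f_t(x)

% Path-length of the minimizer sequence: how far the comparator moves in total.
P_T = \sum_{t=2}^{T} \lVert x_t^* - x_{t-1}^* \rVert
```

Because $\sum_t f_t(x_t^*) \le \min_{x} \sum_t f_t(x)$, the dynamic regret is never smaller than the static regret, which is the sense in which it is the stronger measure. When the environment is stationary, $P_T = 0$.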


Taxonomy

Topics: Advanced Bandit Algorithms Research · Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques

Methods: AdaGrad