On the Convergence of AdaGrad(Norm) on $\mathbb{R}^{d}$: Beyond Convexity, Non-Asymptotic Rate and Acceleration
Zijian Liu, Ta Duy Nguyen, Alina Ene, Huy L. Nguyen

TL;DR
This paper provides a comprehensive analysis of AdaGrad's convergence properties in unconstrained smooth convex and quasar convex optimization, introducing new techniques, variants, and accelerated algorithms with explicit rates.
Contribution
It develops explicit convergence bounds for AdaGrad in unconstrained settings, proposes a variant with last-iterate convergence, and introduces accelerated adaptive algorithms with improved rates.
Findings
Explicit convergence bounds for AdaGrad in unconstrained problems
A variant of AdaGrad with last-iterate convergence guarantees
Accelerated adaptive algorithms with improved deterministic rates
Abstract
Existing analysis of AdaGrad and other adaptive methods for smooth convex optimization is typically for functions with bounded domain diameter. In unconstrained problems, previous works guarantee an asymptotic convergence rate without an explicit constant factor that holds true for the entire function class. Furthermore, in the stochastic setting, only a modified version of AdaGrad, different from the one commonly used in practice, in which the latest gradient is not used to update the stepsize, has been analyzed. Our paper aims at bridging these gaps and developing a deeper understanding of AdaGrad and its variants in the standard setting of smooth convex functions as well as the more general setting of quasar convex functions. First, we demonstrate new techniques to explicitly bound the convergence rate of the vanilla AdaGrad for unconstrained problems in both deterministic and…
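The abstract's distinction between the AdaGrad used in practice and the modified version, in which the latest gradient is not used to update the stepsize, can be illustrated on the scalar-stepsize variant (AdaGradNorm). The following is a minimal sketch, not the paper's algorithm statement: it assumes the commonly used form $b_t^2 = b_{t-1}^2 + \|g_t\|^2$ with step $\eta / b_t$ (or $\eta / b_{t-1}$ for the modified version). The function name, the parameters `eta`, `b0`, `steps`, and the flag `use_latest_gradient` are all hypothetical choices made for illustration.

```python
import math


def adagrad_norm(grad, x0, eta=1.0, b0=1.0, steps=100, use_latest_gradient=True):
    """Illustrative AdaGradNorm on R^d with a single scalar stepsize.

    Assumed update (not quoted from the paper):
        b_t^2 = b_{t-1}^2 + ||g_t||^2
        x_{t+1} = x_t - (eta / b_t) * g_t       if use_latest_gradient
        x_{t+1} = x_t - (eta / b_{t-1}) * g_t   otherwise (modified version)
    """
    x = list(x0)
    b_sq = b0 ** 2  # b0^2 plus the running sum of squared gradient norms
    for _ in range(steps):
        g = grad(x)
        g_sq = sum(gi * gi for gi in g)
        if use_latest_gradient:
            # Practical variant: the current gradient enters the stepsize.
            b_sq += g_sq
            scale = eta / math.sqrt(b_sq)
        else:
            # Modified variant: the stepsize only sees past gradients.
            scale = eta / math.sqrt(b_sq)
            b_sq += g_sq
        x = [xi - scale * gi for xi, gi in zip(x, g)]
    return x


def quadratic_grad(x):
    # Gradient of the toy objective f(x) = 0.5 * ||x||^2 (smooth and convex).
    return list(x)


if __name__ == "__main__":
    for flag in (True, False):
        x_final = adagrad_norm(quadratic_grad, [3.0, -4.0], eta=1.0, b0=2.0,
                               steps=200, use_latest_gradient=flag)
        print(flag, math.hypot(*x_final))
```

The only difference between the two branches is whether `b_sq` is updated before or after the stepsize is computed. Neither branch bounds the iterates or projects onto a domain, which matches the unconstrained setting the abstract describes.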
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Optimization and Variational Analysis
Methods: AdaGrad
