Gradient Methods with Online Scaling Part I. Theoretical Foundations
Wenzhi Gao, Ya-Chi Chu, Yinyu Ye, Madeleine Udell

TL;DR
This paper introduces online scaled gradient methods (OSGM), a framework that uses online learning to adapt stepsizes, provably accelerating first-order methods and explaining empirical hypergradient-descent heuristics in machine learning optimization.
Contribution
It establishes the theoretical foundations of OSGM, demonstrating its convergence guarantees and superlinear rates, and connects it to practical hypergradient-descent heuristics.
Findings
Achieves trajectory-dependent global convergence on smooth convex functions.
Provides an improved complexity result for smooth strongly convex problems.
Exhibits local superlinear convergence, similar to quasi-Newton methods.
Abstract
This paper establishes the theoretical foundations of the online scaled gradient methods (OSGM), a framework that utilizes online learning to adapt stepsizes and provably accelerate first-order methods. OSGM quantifies the effectiveness of a stepsize by a feedback function motivated by a convergence measure and uses the feedback to adjust the stepsize through an online learning algorithm. Consequently, instantiations of OSGM achieve convergence rates that are asymptotically no worse than the optimal stepsize. OSGM yields desirable convergence guarantees on smooth convex problems, including 1) trajectory-dependent global convergence on smooth convex objectives; 2) an improved complexity result on smooth strongly convex problems; and 3) local superlinear convergence. Notably, OSGM constitutes a new family of first-order methods with non-asymptotic superlinear convergence, joining the…
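The feedback loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm as stated in the paper: it assumes a single scalar stepsize `p`, a diagonal quadratic objective, a hypergradient-style feedback h(p) = (f(x - p g) - f(x)) / ||g||^2 with g the gradient at x, plain online gradient descent on `p`, and a monotone safeguard that keeps an iterate only if it does not increase f. The function names, the test problem, and the learning rate `eta` are all hypothetical choices made for illustration.

```python
def f(x, a):
    # Diagonal quadratic f(x) = 0.5 * sum_i a_i * x_i^2 (illustrative objective).
    return 0.5 * sum(ai * xi * xi for ai, xi in zip(a, x))


def grad(x, a):
    # Gradient of the diagonal quadratic.
    return [ai * xi for ai, xi in zip(a, x)]


def osgm_scalar(a, x0, eta, iters):
    """Sketch: gradient descent whose scalar stepsize p is learned online."""
    x = list(x0)
    p = 0.0  # initial stepsize (assumption: start from zero)
    history = [f(x, a)]
    for _ in range(iters):
        g = grad(x, a)
        gg = sum(gi * gi for gi in g)
        if gg == 0.0:
            break  # stationary point reached
        # Trial step with the current stepsize.
        x_trial = [xi - p * gi for xi, gi in zip(x, g)]
        # Derivative of the feedback h(p) with respect to p:
        # h'(p) = -<grad f(x - p g), g> / ||g||^2.
        g_trial = grad(x_trial, a)
        hyper = -sum(gt * gi for gt, gi in zip(g_trial, g)) / gg
        # Online gradient descent update on the stepsize.
        p -= eta * hyper
        # Monotone safeguard: keep the trial point only if f does not increase.
        if f(x_trial, a) <= f(x, a):
            x = x_trial
        history.append(f(x, a))
    return x, p, history


# Example run: curvatures 1, 10, 100, so the smoothness constant is L = 100.
a = [1.0, 10.0, 100.0]
x_final, p_final, history = osgm_scalar(a, [1.0, 1.0, 1.0], eta=0.01, iters=200)
```

Here `eta = 0.01` is chosen as 1/L for this particular quadratic, since the feedback is a quadratic in `p` with curvature at most L. A matrix-valued stepsize or a different online learning algorithm would be handled analogously, but those variants are not sketched here.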
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Advanced Optimization Algorithms Research · Advanced Bandit Algorithms Research
