Gradient Methods with Online Scaling

Wenzhi Gao; Ya-Chi Chu; Yinyu Ye; Madeleine Udell

arXiv:2411.01803·math.OC·November 7, 2024

TL;DR

This paper presents a framework that uses online learning to adaptively scale gradients and accelerate gradient-based optimization, achieving improved complexity bounds and superlinear convergence on convex quadratics.

Contribution

It introduces a novel online learning-based gradient scaling framework that provably accelerates convergence and improves complexity bounds over previous methods.

Findings

01

Achieves $O(\text{condition number} \times \log(1/\varepsilon))$ complexity for strongly convex problems.

02

Demonstrates superlinear convergence on convex quadratics.

03

Shows hypergradient descent improves convergence over standard gradient descent.

Abstract

We introduce a framework to accelerate the convergence of gradient-based methods with online learning. The framework learns to scale the gradient at each iteration through an online learning algorithm and provably accelerates gradient-based methods asymptotically. In contrast with previous literature, where convergence is established based on worst-case analysis, our framework provides a strong convergence guarantee with respect to the optimal scaling matrix for the iteration trajectory. For smooth strongly convex optimization, our results provide an $O(\kappa^{\star} \log(1/\varepsilon))$ complexity result, where $\kappa^{\star}$ is the condition number achievable by the optimal preconditioner, improving on the previous $O(n \kappa^{\star} \log(1/\varepsilon))$ result. In particular, a variant of our method achieves superlinear convergence on convex quadratics. For smooth convex…
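The idea of learning a gradient scaling online can be illustrated with a small sketch. This is not the authors' algorithm or the code in their repository: it is a simplified hypergradient-style update on a diagonal convex quadratic, written under several assumptions. The function name `online_scaled_gd`, the restriction to a diagonal scaling vector `d`, the initial scaling `1/L`, and the learning rate `eta = 1/L` are all choices made here for illustration. Each step applies the current scaling to the gradient, then nudges the scaling along the product of consecutive gradients, normalized by the squared gradient norm.

```python
import numpy as np


def online_scaled_gd(a, x0, eta, iters):
    """Scaled gradient descent on f(x) = 0.5 * sum(a_i * x_i**2).

    The diagonal scaling d is updated online with a hypergradient-style
    rule (an illustrative assumption, not the paper's exact method).
    Setting eta = 0 keeps d fixed at 1/L, i.e. plain gradient descent.
    """
    a = np.asarray(a, dtype=float)
    x = np.array(x0, dtype=float)
    L = a.max()                      # smoothness constant of the quadratic
    d = np.full_like(x, 1.0 / L)     # start from the safe step size 1/L
    g = a * x                        # gradient of f at x
    history = [0.5 * np.sum(a * x**2)]
    for _ in range(iters):
        gg = g @ g
        if gg == 0.0:                # already at the minimizer
            break
        x = x - d * g                # scaled gradient step
        g_new = a * x
        # Online update of the scaling: move d along the (normalized)
        # product of the new and old gradients.
        d = d + eta * g_new * g / gg
        g = g_new
        history.append(0.5 * np.sum(a * x**2))
    return x, d, history


if __name__ == "__main__":
    a = [1.0, 10.0, 100.0]           # eigenvalues, condition number 100
    x0 = [1.0, 1.0, 1.0]
    _, d_online, h_online = online_scaled_gd(a, x0, eta=0.01, iters=200)
    _, _, h_plain = online_scaled_gd(a, x0, eta=0.0, iters=200)
    print("final f, online scaling:", h_online[-1])
    print("final f, plain GD      :", h_plain[-1])
    print("learned scaling        :", d_online)
```

On this toy problem each entry of `d` can only move from `1/L` toward `1/a_i`, so every coordinate contracts at least as fast as it does under plain gradient descent with step `1/L`. The guarantees stated in the abstract concern the authors' framework and analysis, not this sketch.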

Code & Models

Repositories

Gwzwpxz/osgm
Official

Taxonomy

Topics: Advanced Bandit Algorithms Research · Stochastic Gradient Optimization Techniques