Robustly Stable Accelerated Momentum Methods With A Near-Optimal L2 Gain and $H_\infty$ Performance
Mert Gurbuzbalaban

TL;DR
This paper analyzes the robustness and convergence of a broad class of momentum-based optimization algorithms under worst-case gradient errors, providing explicit measures of stability and robustness related to $H_\infty$ norms.
Contribution
It introduces a unified framework to evaluate the robustness of momentum methods using $H_\infty$ gain analysis, encompassing several classical algorithms as special cases.
Findings
Heavy-ball method is less robust than Nesterov's accelerated gradient.
Explicit formulas for worst-case gradient errors for quadratic objectives.
Comparison of stability radii among different momentum methods.
Abstract
We consider the problem of minimizing a strongly convex smooth function where the gradients are subject to additive worst-case deterministic errors that are square-summable. We study the trade-offs between the convergence rate and robustness to gradient errors when designing the parameters of a first-order algorithm. We focus on a general class of momentum methods (GMM) with constant stepsize and momentum parameters which can recover gradient descent, Nesterov's accelerated gradient, the heavy-ball and the triple momentum methods as special cases. We measure the robustness of an algorithm in terms of the cumulative suboptimality over the iterations divided by the norm of the gradient errors, which can be interpreted as the minimal (induced) gain of a transformed dynamical system that represents the GMM iterations where the input is the gradient error sequence and the…
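The abstract describes a single family of momentum methods that contains gradient descent, Nesterov's accelerated gradient, and the heavy-ball method as special cases, and a robustness measure built from cumulative suboptimality relative to the size of the gradient errors. The following is a minimal sketch of that idea, not the paper's own code. It assumes a three-parameter form common in the literature, x_{k+1} = x_k + beta (x_k - x_{k-1}) - alpha (grad f(y_k) + w_k) with y_k = x_k + nu (x_k - x_{k-1}); the names `gmm`, `alpha`, `beta`, `nu`, the test quadratic, the error sequence, and the normalization of the ratio are all illustrative assumptions. The triple momentum method is omitted. The printed ratio is only an empirical value for one particular error sequence, so it is at best a lower bound on any worst-case gain.

```python
import numpy as np


def gmm(grad, x0, alpha, beta, nu, errors):
    """Momentum iteration with additive gradient errors (assumed form).

    x_{k+1} = x_k + beta*(x_k - x_{k-1}) - alpha*(grad(y_k) + w_k)
    y_k     = x_k + nu*(x_k - x_{k-1}),   with x_{-1} = x_0.
    """
    x_prev = np.array(x0, dtype=float)
    x = x_prev.copy()
    iterates = [x.copy()]
    for w in errors:
        y = x + nu * (x - x_prev)
        x_next = x + beta * (x - x_prev) - alpha * (grad(y) + w)
        x_prev, x = x, x_next
        iterates.append(x.copy())
    return np.array(iterates)


# Hypothetical strongly convex quadratic f(x) = 0.5 x^T Q x, minimized at 0.
mu, L = 1.0, 10.0
Q = np.diag([mu, L])


def f(x):
    return 0.5 * x @ Q @ x


def grad(x):
    return Q @ x


# Square-summable deterministic errors w_k = d / (k + 1) with unit direction d.
n_steps = 300
k = np.arange(n_steps)
errors = np.outer(1.0 / (k + 1.0), np.array([1.0, 1.0]) / np.sqrt(2.0))

# Standard textbook parameter choices (alpha, beta, nu) for each special case.
kappa = L / mu
rho = (np.sqrt(kappa) - 1.0) / (np.sqrt(kappa) + 1.0)
params = {
    "GD": (1.0 / L, 0.0, 0.0),
    "NAG": (1.0 / L, rho, rho),
    "HB": (4.0 / (np.sqrt(L) + np.sqrt(mu)) ** 2, rho ** 2, 0.0),
}

# Start at the minimizer so that all suboptimality is caused by the errors.
ratios = {}
for name, (alpha, beta, nu) in params.items():
    xs = gmm(grad, np.zeros(2), alpha, beta, nu, errors)
    cumulative_subopt = sum(f(x) for x in xs)  # f* = 0 for this quadratic
    ratios[name] = cumulative_subopt / np.sum(errors ** 2)
    print(f"{name}: cumulative suboptimality / error energy = {ratios[name]:.4f}")
```

Changing the direction or decay of `errors` changes the ratios, which is why a single run like this cannot rank the methods; the paper's analysis concerns the worst case over all square-summable error sequences.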
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Numerical methods in inverse problems
