Robust Accelerated Gradient Methods for Smooth Strongly Convex Functions

Necdet Serhat Aybat; Alireza Fallah; Mert Gurbuzbalaban; Asuman Ozdaglar

arXiv:1805.10579 · math.OC · November 7, 2019

TL;DR

This paper analyzes the trade-off between convergence rate and robustness to random gradient errors in gradient descent and accelerated gradient methods for strongly convex functions. It gives exact robustness expressions for the quadratic case and tight upper bounds for the smooth strongly convex case, and uses them to select algorithm parameters that balance rate against robustness.

Contribution

It introduces a framework to quantify and optimize the robustness of accelerated gradient methods against random gradient errors, revealing their enhanced noise tolerance.

Findings

01. AG achieves acceleration with greater robustness to gradient noise.

02. Exact robustness expressions derived for quadratic functions.

03. Practical algorithms outperforming state-of-the-art under noise conditions.

Abstract

We study the trade-offs between convergence rate and robustness to gradient errors in designing a first-order algorithm. We focus on gradient descent (GD) and accelerated gradient (AG) methods for minimizing strongly convex functions when the gradient has random errors in the form of additive white noise. With gradient errors, the function values of the iterates need not converge to the optimal value; hence, we define the robustness of an algorithm to noise as the asymptotic expected suboptimality of the iterate sequence to input noise power. For this robustness measure, we provide exact expressions for the quadratic case using tools from robust control theory and tight upper bounds for the smooth strongly convex case using Lyapunov functions certified through matrix inequalities. We use these characterizations within an optimization problem which selects parameters of each algorithm to…
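
The robustness measure described in the abstract can be illustrated numerically. The sketch below is not the paper's code. It assumes a diagonal quadratic f(x) = 0.5 * sum(eigs * x**2) with optimal value 0, i.i.d. Gaussian gradient noise with standard deviation sigma, and reads the measure as the long-run average suboptimality divided by sigma**2. The function name `robustness_estimate`, the eigenvalues, and the step-size and momentum choices are hypothetical and chosen only for illustration.

```python
import numpy as np

def robustness_estimate(eigs, alpha, beta, sigma=1.0,
                        iters=50_000, burn_in=1_000, seed=0):
    """Monte Carlo estimate of (long-run mean of f(x_k) - f*) / sigma^2.

    Assumed setup: f(x) = 0.5 * sum(eigs * x**2), so f* = 0.
    beta = 0 gives noisy gradient descent; beta > 0 adds momentum.
    """
    eigs = np.asarray(eigs, dtype=float)
    rng = np.random.default_rng(seed)
    x_prev = np.ones_like(eigs)
    x = np.ones_like(eigs)
    total = 0.0
    for k in range(iters):
        y = x + beta * (x - x_prev)  # momentum extrapolation
        # Exact gradient at y plus additive white noise.
        noisy_grad = eigs * y + sigma * rng.standard_normal(eigs.size)
        x_prev, x = x, y - alpha * noisy_grad
        if k >= burn_in:  # skip the transient phase
            total += 0.5 * np.sum(eigs * x**2)
    return total / (iters - burn_in) / sigma**2

# Illustrative parameters (assumptions, not values from the paper).
mu, L = 0.1, 1.0
kappa = L / mu
gd = robustness_estimate([mu, L], alpha=1.0 / L, beta=0.0)
ag = robustness_estimate([mu, L], alpha=1.0 / L,
                         beta=(np.sqrt(kappa) - 1) / (np.sqrt(kappa) + 1))
print(f"GD estimate: {gd:.3f}, AG estimate: {ag:.3f}")
```

For beta = 0 each coordinate follows a scalar AR(1) recursion, so under these assumptions the estimate can be checked against the closed form sum(alpha / (2 * (2 - alpha * eigs))). Varying alpha and beta in this sketch gives a rough feel for how the parameters trade convergence rate against noise amplification.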
