Linear Convergence of Generalized Mirror Descent with Time-Dependent Mirrors

Adityanarayanan Radhakrishnan; Mikhail Belkin; Caroline Uhler

arXiv:2009.08574·cs.LG·October 7, 2021·1 cites

TL;DR

This paper extends the Polyak-Lojasiewicz (PL) inequality-based analysis to generalized mirror descent (GMD) and its stochastic variant, covering methods such as Adagrad, and establishes sufficient conditions for their linear convergence in non-convex optimization.

Contribution

It introduces a PL-based analysis of generalized mirror descent with time-dependent mirrors, covers the stochastic version, and gives sufficient conditions for linear convergence.

Findings

01. Established linear convergence conditions for stochastic GMD.

02. Provided learning rates for stochastic mirror descent and Adagrad.

03. Proved convergence of GMD to interpolating solutions for locally PL* functions.

Abstract

The Polyak-Lojasiewicz (PL) inequality is a sufficient condition for establishing linear convergence of gradient descent, even in non-convex settings. While several recent works use a PL-based analysis to establish linear convergence of stochastic gradient descent methods, the question remains as to whether a similar analysis can be conducted for more general optimization methods. In this work, we present a PL-based analysis for linear convergence of generalized mirror descent (GMD), a generalization of mirror descent with a possibly time-dependent mirror. GMD subsumes popular first-order optimization methods including gradient descent, mirror descent, and preconditioned gradient descent methods such as Adagrad. Since the standard PL analysis cannot be extended naturally from GMD to stochastic GMD, we present a Taylor-series-based analysis to establish sufficient conditions for linear…
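
To make the idea of a time-dependent mirror concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation. It assumes the GMD update takes the form phi_t(w_next) = phi_t(w) - eta * grad f(w) and picks a linear, diagonal mirror phi_t(w) = D_t * w, which reduces to preconditioned gradient descent. The diagonal D_t = 1 + sqrt(sum of squared gradients) is an Adagrad-style choice with a unit offset, added here only so that the preconditioner is bounded below by 1. The function name `gmd_least_squares` and the test problem are hypothetical. The loss is overparameterized least squares, which is not strongly convex but satisfies the PL inequality ||grad f(w)||^2 >= 2 * mu * f(w) with mu equal to the smallest eigenvalue of X X^T when X has full row rank.

```python
import numpy as np


def gmd_least_squares(X, y, steps=500):
    """Run a GMD-style iteration on f(w) = 0.5 * ||X w - y||^2.

    Assumed update form: phi_t(w_next) = phi_t(w) - eta * grad f(w),
    with the linear, time-dependent mirror phi_t(w) = D_t * w.
    """
    n, d = X.shape
    w = np.zeros(d)
    accum = np.zeros(d)                 # running sum of squared gradients
    L = np.linalg.norm(X, 2) ** 2       # smoothness constant: largest eigenvalue of X^T X
    eta = 1.0 / L                       # step size; with D_t >= 1 this gives descent
    losses = []
    for _ in range(steps):
        residual = X @ w - y
        losses.append(0.5 * residual @ residual)
        grad = X.T @ residual
        accum += grad ** 2
        D = 1.0 + np.sqrt(accum)        # Adagrad-style diagonal mirror (unit offset is an assumption)
        w = w - eta * grad / D          # solve D * w_next = D * w - eta * grad
    residual = X @ w - y
    losses.append(0.5 * residual @ residual)
    return w, np.array(losses)


# Overparameterized problem: 5 equations, 20 unknowns, so interpolating solutions exist.
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 20))
y = rng.standard_normal(5)
w, losses = gmd_least_squares(X, y)
print("initial loss:", losses[0], "final loss:", losses[-1])
```

Because D_t >= 1 elementwise and eta = 1/L, each step cannot increase the loss on this quadratic. How fast the loss shrinks depends on how large D_t grows, which mirrors the role that bounds on the mirror play in sufficient conditions for linear convergence.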

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Numerical methods in inverse problems