Beyond Convexity -- Contraction and Global Convergence of Gradient Descent

Patrick M. Wensing; Jean-Jacques E. Slotine

arXiv:1806.06655·math.OC·December 23, 2022

TL;DR

This paper introduces a contraction theory framework for analyzing gradient descent, extending classical convexity results to more general, possibly non-convex, settings including Riemannian manifolds and time-varying problems.

Contribution

The paper generalizes the convergence analysis of gradient descent using contraction theory. The analysis covers geodesically convex and non-convex problems and extends to primal-dual and game-theoretic optimization.

Findings

01

Gradient descent converges to a unique equilibrium if its dynamics are contracting in some metric; convexity of the cost is the special case of contraction in the identity metric.

02

Contraction analysis applies to geodesically convex optimization on Riemannian manifolds.

03

Semi-contraction provides insights into the topology of multiple optima.
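
For reference, the first finding can be written out using the standard contraction condition from the contraction theory literature. The notation here ($f$ for the cost, $M(x)$ for the metric, $\beta$ for the contraction rate) is an assumption of this sketch and is not taken from this page:

```latex
% Gradient flow and its Jacobian (symmetric, since it is a negated Hessian)
\dot{x} = -\nabla f(x), \qquad J(x) = -\nabla^2 f(x)

% Contraction with rate \beta > 0 in a uniformly positive definite metric M(x)
\dot{M} + M J + J^{\top} M \preceq -2\beta M

% Special case M = I: the condition reduces to strong convexity of f
-2\,\nabla^2 f(x) \preceq -2\beta I
\quad\Longleftrightarrow\quad
\nabla^2 f(x) \succeq \beta I
```

When the condition holds, any two trajectories of the flow approach each other exponentially in the distance induced by $M$, which is what forces a unique equilibrium.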

Abstract

This paper considers the analysis of continuous time gradient-based optimization algorithms through the lens of nonlinear contraction theory. It demonstrates that in the case of a time-invariant objective, most elementary results on gradient descent based on convexity can be replaced by much more general results based on contraction. In particular, gradient descent converges to a unique equilibrium if its dynamics are contracting in any metric, with convexity of the cost corresponding to the special case of contraction in the identity metric. More broadly, contraction analysis provides new insights for the case of geodesically-convex optimization, wherein non-convex problems in Euclidean space can be transformed to convex ones posed over a Riemannian manifold. In this case, natural gradient descent converges to a unique equilibrium if it is contracting in any metric, with geodesic…
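
The identity-metric case mentioned in the abstract can be checked numerically. The following is a minimal sketch, not code from the paper: the cost `f(x) = 0.5 x^T A x + sum(log cosh(x))`, the matrix `A`, the step size, and the function names are all hypothetical choices for illustration. The Hessian of this cost is `A + diag(sech^2(x))`, so it is bounded below by the smallest eigenvalue of `A`, and the distance between any two forward-Euler trajectories of the gradient flow should shrink at least that fast.

```python
import numpy as np

# Hypothetical strongly convex cost: f(x) = 0.5 x^T A x + sum(log cosh(x)).
# Its Hessian is A + diag(sech^2(x)) >= A, so beta = lambda_min(A) is a valid
# contraction rate in the identity metric.
A = np.array([[2.0, 0.5], [0.5, 1.0]])


def grad_f(x):
    """Gradient of the illustrative cost."""
    return A @ x + np.tanh(x)


def simulate(x0, step=0.01, n_steps=500):
    """Forward-Euler discretization of the gradient flow x' = -grad f(x)."""
    x = np.array(x0, dtype=float)
    traj = [x.copy()]
    for _ in range(n_steps):
        x = x - step * grad_f(x)
        traj.append(x.copy())
    return np.array(traj)


step, n_steps = 0.01, 500
beta = np.linalg.eigvalsh(A).min()  # lower bound on the Hessian eigenvalues

# Two trajectories started from different initial conditions.
xa = simulate([3.0, -2.0], step, n_steps)
xb = simulate([-1.0, 4.0], step, n_steps)

# Distance between the trajectories, and the exponential envelope
# predicted by contraction with rate beta.
dist = np.linalg.norm(xa - xb, axis=1)
t = step * np.arange(n_steps + 1)
bound = dist[0] * np.exp(-beta * t)

print("contraction rate beta:", beta)
print("final distance:", dist[-1], "envelope:", bound[-1])
```

The envelope also holds for the discretized iteration here because the step size is small relative to the largest Hessian eigenvalue; with a large step the Euler scheme can fail to contract even when the continuous-time flow does.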
