Gradient Descent Finds the Cubic-Regularized Non-Convex Newton Step

Yair Carmon; John C. Duchi

arXiv:1612.00547·math.OC·August 31, 2022

TL;DR

This paper proves that gradient descent efficiently finds near-global minima of non-convex cubic-regularized quadratic problems and, when used to approximate the cubic-regularized Newton step, yields a rate of convergence to second-order stationary points of general smooth non-convex functions.

Contribution

It establishes convergence rates for gradient descent in cubic-regularized non-convex problems, showing efficiency despite multiple saddle points and poor local minima.

Findings

01

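Gradient descent approximates the global minimum to within ε in O(ε^{-1} log(1/ε)) steps for large ε and O(log(1/ε)) steps for small ε, relative to a condition number defined in the paper.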

02

Using gradient descent to approximate the cubic-regularized Newton step implies a rate of convergence to second-order stationary points of general smooth non-convex functions.

03

The convergence rates depend at most logarithmically on the problem dimension.

Abstract

We consider the minimization of non-convex quadratic forms regularized by a cubic term, which exhibit multiple saddle points and poor local minima. Nonetheless, we prove that, under mild assumptions, gradient descent approximates the global minimum to within $\epsilon$ accuracy in $O(\epsilon^{-1} \log(1/\epsilon))$ steps for large $\epsilon$ and $O(\log(1/\epsilon))$ steps for small $\epsilon$ (compared to a condition number we define), with at most logarithmic dependence on the problem dimension. When we use gradient descent to approximate the cubic-regularized Newton step, our result implies a rate of convergence to second-order stationary points of general smooth non-convex functions.
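
As a rough illustration of the setting in the abstract, the sketch below runs plain gradient descent on a small cubic-regularized quadratic. The abstract does not write out the objective, so the code assumes the standard form $f(x) = \frac{1}{2} x^T A x + b^T x + \frac{\rho}{3} \|x\|^3$ with a symmetric, possibly indefinite $A$. The starting point $x = 0$, the fixed step size, the radius bound `R`, the example data, and every function name are assumptions made for this illustration and are not taken from the paper. The final check uses the well-known characterization of a global minimizer of this objective: $\nabla f(x) = 0$ together with $\rho \|x\| \ge -\lambda_{\min}(A)$.

```python
import numpy as np

def cubic_gradient(A, b, rho, x):
    """Gradient of f(x) = 0.5 x^T A x + b^T x + (rho/3) ||x||^3 (assumed form)."""
    return A @ x + b + rho * np.linalg.norm(x) * x

def gradient_descent_cubic(A, b, rho, num_steps=5000):
    """Plain gradient descent from x = 0 with a fixed, conservative step size."""
    beta = np.linalg.norm(A, 2)  # spectral norm of A
    b_norm = np.linalg.norm(b)
    # Any stationary point satisfies rho*||x||^2 <= ||b|| + beta*||x||,
    # which gives this upper bound R on its norm.
    R = beta / (2 * rho) + np.sqrt((beta / (2 * rho)) ** 2 + b_norm / rho)
    # Assumed step size for this sketch: small relative to the curvature
    # of f inside the ball of radius R.
    eta = 1.0 / (4.0 * (beta + rho * R))
    x = np.zeros_like(b, dtype=float)
    for _ in range(num_steps):
        x = x - eta * cubic_gradient(A, b, rho, x)
    return x

# Hypothetical example: A has a negative eigenvalue, so f is non-convex,
# and b has a nonzero component along the corresponding eigenvector.
A = np.diag([-1.0, 0.5, 2.0])
b = np.array([0.1, -0.3, 0.2])
rho = 1.0

x_hat = gradient_descent_cubic(A, b, rho)
grad_norm = np.linalg.norm(cubic_gradient(A, b, rho, x_hat))
lam_min = np.linalg.eigvalsh(A)[0]  # smallest eigenvalue of A
is_global = rho * np.linalg.norm(x_hat) >= -lam_min
print("gradient norm:", grad_norm)
print("global optimality condition holds:", is_global)
```

In this example the component of `b` along the most negative eigenvector of `A` is nonzero, which is the kind of mild non-degeneracy under which one would expect the iterates to avoid the saddle points; the exact assumptions and step-size conditions are given in the paper itself.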

Taxonomy

Topics: Aortic aneurysm repair treatments · Advanced Optimization Algorithms Research · Optimization and Variational Analysis