Gradient Descent Finds the Cubic-Regularized Non-Convex Newton Step
Yair Carmon, John C. Duchi

TL;DR
This paper proves that gradient descent can efficiently find near-global minima in non-convex cubic-regularized quadratic problems and approximate second-order stationary points in general non-convex functions.
Contribution
It establishes convergence rates for gradient descent in cubic-regularized non-convex problems, showing efficiency despite multiple saddle points and poor local minima.
Findings
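Gradient descent approximates the global minimum to within accuracy ε in O(ε^{-1} log(1/ε)) steps for large ε and O(log(1/ε)) steps for small ε, where "large" and "small" are relative to a problem-dependent condition number.
Used to approximate the cubic-regularized Newton step, gradient descent yields a rate of convergence to second-order stationary points of general smooth non-convex functions.
The convergence rates depend at most logarithmically on the problem dimension.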
Abstract
We consider the minimization of non-convex quadratic forms regularized by a cubic term, which exhibit multiple saddle points and poor local minima. Nonetheless, we prove that, under mild assumptions, gradient descent approximates the global minimum to within ε accuracy in O(ε^{-1} log(1/ε)) steps for large ε and O(log(1/ε)) steps for small ε (compared to a condition number we define), with at most logarithmic dependence on the problem dimension. When we use gradient descent to approximate the cubic-regularized Newton step, our result implies a rate of convergence to second-order stationary points of general smooth non-convex functions.
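To make the setting concrete, here is a minimal Python sketch of plain gradient descent on a cubic-regularized quadratic f(x) = (1/2) x^T A x + b^T x + (ρ/3)||x||^3 with an indefinite A. Everything in it is an assumption made for illustration rather than something stated on this page: the function names, the small test instance, the zero initialization, and the conservative step size 1/(4(β + ρR)), where β is the spectral norm of A and R bounds the norm of any global minimizer. The reference solution comes from an eigendecomposition plus bisection on the stationarity condition (A + ρ||x|| I) x = -b, which is only practical for small problems and is used here purely as a check.

```python
import numpy as np

def objective(A, b, rho, x):
    """f(x) = 0.5 x^T A x + b^T x + (rho / 3) ||x||^3."""
    return 0.5 * x @ A @ x + b @ x + rho / 3.0 * np.linalg.norm(x) ** 3

def gradient(A, b, rho, x):
    """Gradient of f: A x + b + rho ||x|| x."""
    return A @ x + b + rho * np.linalg.norm(x) * x

def norm_bound(beta, b_norm, rho):
    """Largest root R of rho R^2 - beta R - ||b|| = 0.

    Any stationary point satisfies rho ||x||^2 <= beta ||x|| + ||b||,
    so R bounds the norm of the global minimizer.
    """
    half = beta / (2.0 * rho)
    return half + np.sqrt(half ** 2 + b_norm / rho)

def gradient_descent(A, b, rho, steps=5000):
    """Fixed-step gradient descent started from the origin (sketch)."""
    beta = np.linalg.norm(A, 2)  # spectral norm of A
    R = norm_bound(beta, np.linalg.norm(b), rho)
    eta = 1.0 / (4.0 * (beta + rho * R))  # conservative step size (assumption)
    x = np.zeros_like(b)
    for _ in range(steps):
        x = x - eta * gradient(A, b, rho, x)
    return x

def reference_minimizer(A, b, rho, iters=200):
    """Global minimizer via eigendecomposition and bisection on r = ||x||.

    Assumes b has a nonzero component along the eigenvector of the
    smallest eigenvalue of A (the non-degenerate case).
    """
    lam, V = np.linalg.eigh(A)  # eigenvalues in ascending order
    c = V.T @ b
    beta = np.max(np.abs(lam))
    lo = max(0.0, -lam[0] / rho)  # A + rho r I must be positive semidefinite
    hi = norm_bound(beta, np.linalg.norm(b), rho)
    for _ in range(iters):
        r = 0.5 * (lo + hi)
        # ||(A + rho r I)^{-1} b|| decreases in r; find where it equals r.
        if np.linalg.norm(c / (lam + rho * r)) > r:
            lo = r
        else:
            hi = r
    r = 0.5 * (lo + hi)
    return V @ (-c / (lam + rho * r))

# Small indefinite test instance (hypothetical, fixed seed for repeatability).
rng = np.random.default_rng(0)
n = 10
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q @ np.diag(np.linspace(-1.0, 1.0, n)) @ Q.T
A = 0.5 * (A + A.T)  # remove rounding asymmetry
b = Q @ np.ones(n)   # nonzero component along every eigenvector
rho = 1.0

x_gd = gradient_descent(A, b, rho)
x_star = reference_minimizer(A, b, rho)
print("distance to reference minimizer:", np.linalg.norm(x_gd - x_star))
print("objective gap:", objective(A, b, rho, x_gd) - objective(A, b, rho, x_star))
```

On this instance A has negative eigenvalues, so the problem is non-convex, yet the iterates started at the origin should approach the reference minimizer. The sketch does not attempt to reproduce the iteration bounds quoted in the abstract.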
Taxonomy
Topics: Advanced Optimization Algorithms Research · Optimization and Variational Analysis
