Accelerated Methods for Non-Convex Optimization

Yair Carmon; John C. Duchi; Oliver Hinder; Aaron Sidford

arXiv:1611.00756·math.OC·February 3, 2017·55 cites

Open Access

TL;DR

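This paper introduces a Hessian-free accelerated gradient method for non-convex optimization that finds an $\epsilon$-stationary point in $O(\epsilon^{-7/4} \log(1/\epsilon))$ time, improving on the $O(\epsilon^{-2})$ complexity of gradient descent, and adds a second-order guarantee. Because it uses only gradient computations, it is suitable for large-scale problems.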

Contribution

It proposes an accelerated method for non-convex optimization that improves on the $O(\epsilon^{-2})$ complexity of gradient descent while requiring only gradient evaluations and providing a second-order guarantee.

Findings

01

Achieves $O(\epsilon^{-7/4} \log(1/\epsilon))$ complexity

02

Provides the second-order guarantee $\nabla^{2} f(x) \succeq -O(\epsilon^{1/2}) I$ with a Hessian-free approach

03

Suitable for large-scale non-convex problems
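
To make the first finding concrete, the ratio of the two stated bounds can be written out. This is only a comparison of the asymptotic expressions: it ignores the constants hidden in the $O(\cdot)$ notation, so it illustrates the trend rather than predicting a speedup.

$$
\frac{\epsilon^{-2}}{\epsilon^{-7/4} \log(1/\epsilon)} = \frac{\epsilon^{-1/4}}{\log(1/\epsilon)} \to \infty \quad \text{as } \epsilon \to 0.
$$

The gap between the two bounds therefore widens as the target accuracy $\epsilon$ shrinks, although slowly, since it grows like $\epsilon^{-1/4}$ up to a logarithmic factor.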

Abstract

We present an accelerated gradient method for non-convex optimization problems with Lipschitz continuous first and second derivatives. The method requires time $O(\epsilon^{-7/4} \log(1/\epsilon))$ to find an $\epsilon$-stationary point, meaning a point $x$ such that $\|\nabla f(x)\| \leq \epsilon$. The method improves upon the $O(\epsilon^{-2})$ complexity of gradient descent and provides the additional second-order guarantee that $\nabla^{2} f(x) \succeq -O(\epsilon^{1/2}) I$ for the computed $x$. Furthermore, our method is Hessian-free, i.e. it only requires gradient computations, and is therefore suitable for large-scale applications.
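
The two conditions in the abstract can be checked numerically with gradient calls alone, which is one way to read "Hessian-free". The sketch below is not the paper's algorithm; it only tests a given point. All names (`hvp`, `min_eig_estimate`, `check_point`) are hypothetical, the constant `c` standing in for the hidden constant in $O(\epsilon^{1/2})$ is an assumption, and `lip` is assumed to be an upper bound on the largest Hessian eigenvalue.

```python
import numpy as np

def hvp(grad, x, v, h=1e-5):
    """Approximate the Hessian-vector product H(x) v with two gradient calls
    (central finite difference); the Hessian is never formed."""
    return (grad(x + h * v) - grad(x - h * v)) / (2.0 * h)

def min_eig_estimate(grad, x, lip, iters=200, seed=0):
    """Estimate the smallest Hessian eigenvalue by power iteration on
    lip * I - H. Assumes lip >= largest eigenvalue of H (an assumption)."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(x.shape)
    v /= np.linalg.norm(v)
    for _ in range(iters):
        w = lip * v - hvp(grad, x, v)
        v = w / np.linalg.norm(w)
    # Rayleigh quotient of H at the converged direction.
    return float(v @ hvp(grad, x, v))

def check_point(grad, x, eps, lip, c=1.0):
    """Return (first_order_ok, second_order_ok) for the point x.
    c is an assumed stand-in for the constant hidden in O(eps^{1/2})."""
    first_order = np.linalg.norm(grad(x)) <= eps
    second_order = min_eig_estimate(grad, x, lip) >= -c * np.sqrt(eps)
    return bool(first_order), bool(second_order)

# Toy quadratics f(x) = 0.5 * x^T A x, whose gradient is A x.
A_saddle = np.diag([1.0, -0.5])  # strict saddle at the origin
A_bowl = np.diag([1.0, 0.2])     # local minimum at the origin
x0 = np.zeros(2)

saddle_result = check_point(lambda x: A_saddle @ x, x0, eps=0.01, lip=1.0)
bowl_result = check_point(lambda x: A_bowl @ x, x0, eps=0.01, lip=1.0)
```

On these toy problems the origin is $\epsilon$-stationary in both cases, but only the second passes the curvature test, which shows why the second-order guarantee is a stronger statement than a small gradient alone.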

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Advanced Optimization Algorithms Research