Efficient approaches for escaping higher order saddle points in non-convex optimization

Anima Anandkumar; Rong Ge

arXiv:1602.05908·cs.LG·February 19, 2016·49 cites

TL;DR

This paper introduces an efficient algorithm that is guaranteed to converge to a third order local optimum in non-convex optimization. It uses higher order derivatives to escape degenerate saddle points that first and second order information cannot detect.

Contribution

The paper presents the first efficient method for escaping higher order saddle points. It uses third order derivatives, whereas existing techniques guarantee at most second order optimality.

Findings

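01 The algorithm is guaranteed to converge to a third order local optimum.

02 Extending the guarantee to fourth order local optima is NP-hard.

03 The method targets degenerate saddle points, which first and second order derivatives cannot distinguish from local optima and which arise in high-dimensional non-convex functions.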

Abstract

Local search heuristics for non-convex optimization are popular in applied machine learning. However, in general it is hard to guarantee that such algorithms even converge to a local minimum, due to the existence of complicated saddle point structures in high dimensions. Many functions have degenerate saddle points that the first and second order derivatives cannot distinguish from local optima. In this paper we use higher order derivatives to escape these saddle points: we design the first efficient algorithm guaranteed to converge to a third order local optimum (while existing techniques are at most second order). We also show that it is NP-hard to extend this further to finding fourth order local optima.
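To make the notion of a degenerate saddle point concrete, here is a minimal sketch in Python. It is an illustration only and is not the paper's algorithm. The toy function f(x, y) = x^3 + y^2 and the helper names (`grad`, `hess`, `third_along`) are assumptions chosen for this example. At the origin the gradient is zero and the Hessian is positive semidefinite, so first and second order tests cannot rule out a local minimum, yet the third derivative along the Hessian's null direction is nonzero and reveals a descent direction.

```python
import numpy as np

# Hypothetical toy function with a degenerate saddle at the origin.
def f(p):
    x, y = p
    return x**3 + y**2

def grad(p):
    x, y = p
    return np.array([3 * x**2, 2 * y])

def hess(p):
    x, y = p
    return np.array([[6 * x, 0.0], [0.0, 2.0]])

def third_along(u):
    # The only nonzero third partial derivative of f is d^3 f / dx^3 = 6,
    # so the third order form evaluated at (u, u, u) is 6 * u_x^3.
    return 6.0 * u[0] ** 3

origin = np.zeros(2)
g = grad(origin)                      # zero: first order test passes
H = hess(origin)
eigvals, eigvecs = np.linalg.eigh(H)  # eigenvalues in ascending order
u = eigvecs[:, 0]                     # direction with zero curvature
t3 = third_along(u)                   # nonzero: third order test fails

# Step against the sign of the third order term to decrease f.
step = -np.sign(t3) * 0.1 * u
decrease = f(origin + step) - f(origin)

print("gradient:", g)
print("Hessian eigenvalues:", eigvals)
print("third order term along null direction:", t3)
print("change in f after step:", decrease)
```

Flipping the step's sign according to the third order term handles the arbitrary sign of the eigenvector returned by `np.linalg.eigh`, so the step always moves toward negative x, where f decreases.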

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Machine Learning and Algorithms