Efficient approaches for escaping higher order saddle points in non-convex optimization
Anima Anandkumar, Rong Ge

TL;DR
This paper introduces an efficient algorithm that guarantees convergence to third order local optima in non-convex optimization, addressing the challenge of escaping complex saddle points using higher order derivatives.
Contribution
The paper presents the first efficient method for escaping higher order saddle points by leveraging third order derivatives, going beyond existing techniques, which guarantee at most second order local optima.
Findings
The algorithm is guaranteed to converge to a third order local optimum.
Extending the guarantee to finding fourth order local optima is NP-hard.
Addresses degenerate saddle points in high-dimensional non-convex functions.
Abstract
Local search heuristics for non-convex optimization are popular in applied machine learning. However, in general it is hard to guarantee that such algorithms even converge to a local minimum, due to the existence of complicated saddle point structures in high dimensions. Many functions have degenerate saddle points such that the first and second order derivatives cannot distinguish them from local optima. In this paper we use higher order derivatives to escape these saddle points: we design the first efficient algorithm guaranteed to converge to a third order local optimum (while existing techniques are at most second order). We also show that it is NP-hard to extend this further to finding fourth order local optima.
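To make the notion of a degenerate saddle point concrete, here is a minimal sketch in Python. It is an illustration only, not the paper's algorithm. The test function (the "monkey saddle" f(x, y) = x^3 - 3xy^2), the helper names, and the step size 0.1 are all assumptions chosen for the example. At the origin both the gradient and the Hessian vanish, so first and second order checks cannot tell the point apart from a local minimum, yet the third derivative along a direction u is nonzero and reveals a descent direction.

```python
import numpy as np

# Hypothetical test function (not from the paper): the monkey saddle.
def f(p):
    x, y = p
    return x**3 - 3.0 * x * y**2

def grad(p):
    x, y = p
    return np.array([3.0 * x**2 - 3.0 * y**2, -6.0 * x * y])

def hess(p):
    x, y = p
    return np.array([[6.0 * x, -6.0 * y],
                     [-6.0 * y, -6.0 * x]])

def third_directional_at_origin(u):
    # f is a homogeneous cubic, so f(t*u) = t^3 * f(u) and the third
    # derivative of t -> f(t*u) at t = 0 equals 6 * f(u).
    return 6.0 * f(u)

origin = np.zeros(2)
u = np.array([1.0, 0.0])

grad_norm = np.linalg.norm(grad(origin))          # first order information
min_eig = np.linalg.eigvalsh(hess(origin)).min()  # second order information
c3 = third_directional_at_origin(u)               # third order information

# Step against the sign of the third derivative (step size is arbitrary).
step = -np.sign(c3) * 0.1 * u
escaped_value = f(origin + step)

print("gradient norm:", grad_norm)
print("smallest Hessian eigenvalue:", min_eig)
print("third derivative along u:", c3)
print("f after the step:", escaped_value, "vs f at the origin:", f(origin))
```

The origin passes both the first order test (zero gradient) and the second order test (no negative Hessian eigenvalue), but the nonzero third derivative shows that it is not a third order local optimum: a small step along -u strictly decreases f.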
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Machine Learning and Algorithms
