A Newton-Based Method for Nonconvex Optimization with Fast Evasion of Saddle Points

Santiago Paternain; Aryan Mokhtari; Alejandro Ribeiro

arXiv:1707.08028·math.OC·July 23, 2018·SIAM J. Optim.·1 cites

TL;DR

This paper introduces a modified Newton method for nonconvex optimization that escapes saddle points in a number of iterations logarithmic in the target accuracy and converges to a local minimum with high probability.

Contribution

It proposes a second-order algorithm that replaces the negative eigenvalues of the Hessian with their absolute values and uses a truncated version of the resulting matrix to account for curvature, enabling rapid escape from saddle points.
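The eigenvalue replacement can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `abs_truncated_newton_step`, the threshold `m`, the reading of "truncation" as flooring the absolute eigenvalues at `m`, and the unit step size are all assumptions made for illustration. The toy objective $f(x, y) = \tfrac{1}{2}(x^2 - y^2)$ has a saddle at the origin.

```python
import numpy as np

def abs_truncated_newton_step(grad, hess, m=1e-3):
    """Hypothetical helper: Newton-like step using |eigenvalues| floored at m."""
    # Eigendecomposition of the symmetric Hessian: hess = Q diag(eigvals) Q^T
    eigvals, eigvecs = np.linalg.eigh(hess)
    # Replace negative eigenvalues by their absolute values, then floor at m
    # (assumed form of the truncation; the abstract does not spell it out)
    trunc = np.maximum(np.abs(eigvals), m)
    # Step = -Q diag(1 / trunc) Q^T grad
    return -eigvecs @ ((eigvecs.T @ grad) / trunc)

# Toy saddle f(x, y) = 0.5 * (x**2 - y**2), with saddle point at the origin
def grad_f(p):
    return np.array([p[0], -p[1]])

def hess_f(p):
    return np.array([[1.0, 0.0], [0.0, -1.0]])

p = np.array([1.0, 0.1])

# Modified step (unit step size is an assumption)
p_next = p + abs_truncated_newton_step(grad_f(p), hess_f(p))

# Plain Newton step for comparison: it jumps straight onto the saddle
newton_next = p - np.linalg.solve(hess_f(p), grad_f(p))

print("modified step:", p_next)
print("plain Newton :", newton_next)
```

On this toy problem the plain Newton step lands on the saddle, while the modified step moves away from it along the negative-curvature direction.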

Findings

01

Escapes saddle points in at most $1 + \log_{3/2}(\delta/2\varepsilon)$ iterations

02

Converges to a local minimum with high probability

03

Escape rate has base 3/2, independent of problem-specific constants

Abstract

Machine learning problems such as neural network training, tensor decomposition, and matrix factorization require local minimization of a nonconvex function. This local minimization is challenged by the presence of saddle points, of which there can be many and from which descent methods may take an inordinately large number of iterations to escape. This paper presents a second-order method that modifies the update of Newton's method by replacing the negative eigenvalues of the Hessian by their absolute values and uses a truncated version of the resulting matrix to account for the objective's curvature. The method is shown to escape saddles in at most $1 + \log_{3/2}(\delta/2\varepsilon)$ iterations, where $\varepsilon$ is the target optimality and $\delta$ characterizes a point sufficiently far away from the saddle. The base of this exponential escape is $3/2$ independently of problem…
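To get a feel for the size of the escape bound, the short Python sketch below evaluates $1 + \log_{3/2}(\delta/2\varepsilon)$ for sample values. The function name `saddle_escape_bound`, the sample values of `delta` and `eps`, and the reading of $\delta/2\varepsilon$ as $\delta/(2\varepsilon)$ are assumptions for illustration, not taken from the paper.

```python
import math

def saddle_escape_bound(delta, eps):
    """Hypothetical helper: evaluates 1 + log_{3/2}(delta / (2 * eps))."""
    # math.log(x, base) computes the logarithm of x in the given base
    return 1 + math.log(delta / (2 * eps), 1.5)

# Illustrative values only: delta = 0.1, eps = 1e-6
bound = saddle_escape_bound(0.1, 1e-6)
print(f"iteration bound: {bound:.2f}")

# Shrinking eps by a factor of 3/2 adds exactly one iteration to the bound,
# which is what a base-3/2 exponential escape means in practice
tighter = saddle_escape_bound(0.1, 1e-6 / 1.5)
print(f"extra iterations: {tighter - bound:.2f}")
```

For these sample values the bound stays below 28 iterations even though the target accuracy is five orders of magnitude smaller than $\delta$.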

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Advanced Optimization Algorithms Research