The Power of Normalization: Faster Evasion of Saddle Points

Kfir Y. Levy

arXiv:1611.04831·cs.LG·November 22, 2016·67 cites

TL;DR

This paper shows that normalized gradient descent (NGD), with carefully chosen parameters and noise injection, provably escapes saddle points and converges to a local minimum at rates that improve on the fastest previously known first-order algorithm, with an application to online tensor decomposition.

Contribution

The paper provides a theoretical analysis showing NGD's ability to evade saddle points and improve convergence rates, supported by practical application to online tensor decomposition.

Findings

01 NGD provably escapes saddle points when its parameters are carefully chosen and noise is injected.

02 NGD converges to a local minimum at rates that improve on the fastest previously known first-order algorithm (Ge et al., 2015).

03 An application to online tensor decomposition demonstrates the method's effectiveness.

Abstract

A commonly used heuristic in non-convex optimization is Normalized Gradient Descent (NGD), a variant of gradient descent in which only the direction of the gradient is taken into account and its magnitude is ignored. We analyze this heuristic and show that with carefully chosen parameters and noise injection, this method can provably evade saddle points. We establish the convergence of NGD to a local minimum, and demonstrate rates that improve upon the fastest known first-order algorithm, due to Ge et al. (2015). The effectiveness of our method is demonstrated via an application to the problem of online tensor decomposition, a task for which saddle point evasion is known to result in convergence to global minima.
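
To make the idea in the abstract concrete, here is a minimal sketch of normalized gradient descent with injected noise on a toy problem. It is an illustration under stated assumptions, not the paper's algorithm: the toy objective, the function names (`grad_f`, `unit_vector`, `noisy_ngd`, `plain_gd`), the choice to add a unit-norm random perturbation at every step, and all parameter values are hypothetical and are not taken from the paper. The toy function f(x, y) = 0.5x² + 0.25y⁴ − 0.5y² has a saddle point at (0, 0) and local minima at (0, ±1).

```python
import math
import random


def grad_f(p):
    """Gradient of the toy objective f(x, y) = 0.5*x**2 + 0.25*y**4 - 0.5*y**2.

    Hypothetical test function: saddle point at (0, 0), minima at (0, +1) and (0, -1).
    """
    x, y = p
    return [x, y**3 - y]


def unit_vector(v):
    """Return v scaled to unit Euclidean norm (or the zero vector if v is zero)."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v] if norm > 0 else [0.0] * len(v)


def noisy_ngd(grad, p0, step=0.01, noise_scale=0.005, n_steps=2000, seed=0):
    """Normalized gradient descent with a random perturbation at every step.

    Assumption: noise is a uniformly random direction of length noise_scale,
    added at each iteration. The paper's actual schedule may differ.
    """
    rng = random.Random(seed)
    p = list(p0)
    for _ in range(n_steps):
        # Keep only the direction of the gradient; its magnitude is ignored.
        direction = unit_vector(grad(p))
        # A normalized Gaussian sample is uniform on the unit sphere.
        noise = unit_vector([rng.gauss(0.0, 1.0) for _ in p])
        p = [pi - step * di + noise_scale * ni
             for pi, di, ni in zip(p, direction, noise)]
    return p


def plain_gd(grad, p0, step=0.01, n_steps=2000):
    """Ordinary gradient descent without noise, for contrast."""
    p = list(p0)
    for _ in range(n_steps):
        p = [pi - step * gi for pi, gi in zip(p, grad(p))]
    return p


if __name__ == "__main__":
    start = [1.0, 0.0]  # lies exactly on the line y = 0, which leads to the saddle
    print("noisy NGD:", noisy_ngd(grad_f, start))
    print("plain GD: ", plain_gd(grad_f, start))
```

Started from (1, 0), plain gradient descent never leaves the line y = 0 and drifts into the saddle at the origin, while the noisy normalized variant is pushed off that line and settles near one of the two minima. Because the step length is fixed, the iterate hovers within a small neighborhood of the minimum rather than converging exactly.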

Taxonomy

Topics: Tensor decomposition and applications · Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques