Linear Regularizers Enforce the Strict Saddle Property
Matthew Ubl, Kasra Yazdani, Matthew T. Hale

TL;DR
This paper introduces a linear regularization technique that enforces the strict saddle property for non-convex functions, allowing gradient descent to reliably escape non-strict saddle points that arise in machine learning.
Contribution
The authors propose a local linear regularization method that guarantees escape from non-strict saddle points, addressing a gap in existing first-order optimization techniques.
Findings
Regularization with a linear term enforces the strict saddle property.
Gradient descent can escape neighborhoods of non-strict saddle points with the proposed regularizer.
The method is validated through numerical examples on common non-strict saddle points.
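To make the first finding concrete, here is a short worked identity. The notation is assumed for illustration and is not taken from the paper: $f$ is the original function, $a$ is a fixed nonzero vector, and $f_a$ is the linearly regularized function.

```latex
f_a(x) = f(x) + a^\top x, \qquad
\nabla f_a(x) = \nabla f(x) + a, \qquad
\nabla^2 f_a(x) = \nabla^2 f(x).
```

A linear term shifts the gradient by the constant $a$ and leaves the Hessian unchanged. A point $x^*$ with $\nabla f(x^*) = 0$ is therefore no longer a critical point of $f_a$, and the critical points of $f_a$ are instead the solutions of $\nabla f(x) = -a$. This only shows why the regularizer relocates critical points; the paper's argument that the relocated points satisfy the strict saddle property is not reproduced here.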
Abstract
Satisfaction of the strict saddle property has become a standard assumption in non-convex optimization, and it ensures that many first-order optimization algorithms will almost always escape saddle points. However, functions exist in machine learning that do not satisfy this property, such as the loss function of a neural network with at least two hidden layers. First-order methods such as gradient descent may converge to non-strict saddle points of such functions, and there do not currently exist any first-order methods that reliably escape non-strict saddle points. To address this need, we demonstrate that regularizing a function with a linear term enforces the strict saddle property, and we provide justification for only regularizing locally, i.e., when the norm of the gradient falls below a certain threshold. We analyze bifurcations that may result from this form of regularization,…
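The local regularization idea described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the test function (the monkey saddle, whose Hessian vanishes at the origin), the regularizer vector `a`, the gradient-norm `threshold`, the step size, and the iteration count are all hypothetical choices made for this example.

```python
import math

def grad_f(x, y):
    # Gradient of the monkey saddle f(x, y) = x**3 - 3*x*y**2.
    # The Hessian at the origin is the zero matrix, so the origin
    # is a non-strict saddle point.
    return (3 * x * x - 3 * y * y, -6 * x * y)

def descend(start, a=(0.0, 0.0), threshold=1e-3, step=0.01, iters=200):
    # Gradient descent that adds the gradient of the linear term a^T x
    # (which is just a) only when the gradient norm of f falls below
    # `threshold`. All parameter values are illustrative assumptions.
    x, y = start
    for _ in range(iters):
        gx, gy = grad_f(x, y)
        if math.hypot(gx, gy) < threshold:
            gx += a[0]
            gy += a[1]
        x -= step * gx
        y -= step * gy
    return x, y

# Without regularization, an iterate started at the saddle never moves.
plain = descend((0.0, 0.0))

# With a nonzero linear term, the iterate leaves the saddle's neighborhood.
regularized = descend((0.0, 0.0), a=(0.1, 0.0))
```

The monkey saddle is unbounded below, so the iteration count is capped rather than run to convergence; the point of the sketch is only that the regularized iterate moves away from the origin while the unregularized one stays put.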
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Photoacoustic and Ultrasonic Imaging
