Gradient descent avoids strict saddles with a simple line-search method too

Andreea-Alexandra Mușat, Nicolas Boumal

arXiv:2507.13804·math.OC·December 17, 2025

TL;DR

This paper proves that gradient descent with a modified Armijo backtracking line search generically avoids strict saddle points of $C^2$ cost functions, without the usual Lipschitz gradient assumption, and extends the guarantee to Riemannian manifolds when the retraction is real analytic.

Contribution

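It provides a saddle-avoidance guarantee for gradient descent with a line search, where none existed before, for a modified Armijo backtracking method with a generic, arbitrarily large initial step size. The proof relies on the Luzin $N^{-1}$ property of the iteration maps and extends to Riemannian optimization.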
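For reference, the two notions named above can be written out as follows. These are the usual textbook definitions, stated here as an aid and not quoted from the paper; the symbols $f$, $g$, $\mu$ and $x^\ast$ are notation assumed for this note.

```latex
% Assumed notation: f : R^n -> R is C^2, g : R^n -> R^n is an iteration map,
% and \mu is the Lebesgue measure on R^n.
\[
  x^\ast \text{ is a strict saddle of } f
  \iff
  \nabla f(x^\ast) = 0
  \ \text{ and } \
  \lambda_{\min}\!\big(\nabla^2 f(x^\ast)\big) < 0 .
\]
\[
  g \text{ has the Luzin } N^{-1} \text{ property}
  \iff
  \Big( \mu(E) = 0 \implies \mu\big(g^{-1}(E)\big) = 0 \Big)
  \ \text{ for all } E \subseteq \mathbb{R}^n .
\]
```

In words: preimages of negligible sets under $g$ stay negligible, which is what lets a measure-zero set of "bad" points remain measure-zero when pulled back through the iterations.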

Findings

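01

GD with a modified Armijo backtracking line search generically avoids strict saddles of $C^2$ cost functions, with no Lipschitz gradient assumption.

02

The guarantee extends to Riemannian gradient descent (RGD) when the retraction is real analytic.

03

Guarantees for RGD with a constant step size are improved in some scenarios.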

Abstract

It is known that gradient descent (GD) on a $C^{2}$ cost function generically avoids strict saddle points when using a small, constant step size. However, no such guarantee existed for GD with a line-search method. We provide one for a modified version of the standard Armijo backtracking method with generic, arbitrarily large initial step size. The proof underlines the double role of the Luzin $N^{-1}$ property for the iteration maps, and makes it possible to forgo the habitual Lipschitz gradient assumption. We extend this to the Riemannian setting (RGD), assuming the retraction is real analytic (though the cost function still only needs to be $C^{2}$). In closing, we also improve guarantees for RGD with a constant step size in some scenarios.
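To make the setting concrete, here is a minimal sketch of GD with the standard Armijo backtracking line search that the abstract takes as its starting point. It is not the modified method analyzed in the paper, whose details are not given on this page. The function names, the parameters (`alpha0`, `tau`, `c`, `tol`, `max_backtracks`) and the demo cost function are all assumptions chosen for illustration.

```python
import numpy as np


def armijo_gd(f, grad, x0, alpha0=1.0, tau=0.5, c=1e-4,
              tol=1e-6, max_iter=1000, max_backtracks=60):
    """Gradient descent with standard Armijo backtracking (illustrative only).

    alpha0: initial trial step size, tau: shrink factor in (0, 1),
    c: sufficient-decrease constant in (0, 1). All names are hypothetical.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        gnorm2 = float(g @ g)
        if gnorm2 <= tol ** 2:
            break  # gradient norm is below tol: approximately critical
        fx = f(x)
        alpha = alpha0
        # Shrink alpha until f(x - alpha g) <= f(x) - c * alpha * ||g||^2.
        for _ in range(max_backtracks):
            if f(x - alpha * g) <= fx - c * alpha * gnorm2:
                break
            alpha *= tau
        # If the cap is hit, the (tiny) last trial step is taken anyway.
        x = x - alpha * g
    return x


def f_demo(x):
    # Hypothetical test cost: strict saddle at the origin,
    # minimizers at (0, +sqrt(2)) and (0, -sqrt(2)) with value -1.
    return x[0] ** 2 - x[1] ** 2 + 0.25 * x[1] ** 4


def grad_demo(x):
    return np.array([2.0 * x[0], -2.0 * x[1] + x[1] ** 3])


if __name__ == "__main__":
    x_final = armijo_gd(f_demo, grad_demo, [1.0, 0.1])
    print("final iterate:", x_final, "cost:", f_demo(x_final))
```

The demo function has a Hessian with eigenvalues 2 and -2 at the origin, so the origin is a strict saddle. Starting slightly off the axis where the second coordinate is zero, the iterates move away from it and settle at one of the two minimizers. A single run like this only illustrates the behavior; it says nothing about the measure-theoretic guarantee the paper proves.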

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Markov Chains and Monte Carlo Methods · Geometric Analysis and Curvature Flows