A non-autonomous center-stable set theorem for saddle avoidance in optimization
Andreea-Alexandra Mușat, Nicolas Boumal

TL;DR
This paper develops a new non-autonomous Center-Stable Set Theorem to analyze optimization algorithms like gradient descent, proving saddle avoidance even with vanishing step sizes and non-isolated saddles.
Contribution
It introduces a novel non-autonomous Center-Stable Set Theorem (CSST) that extends existing theorems to handle variable step sizes and non-autonomous dynamics in optimization.
Findings
Proves saddle avoidance for gradient descent with non-constant step sizes.
Extends analysis to Riemannian gradient descent and proximal point methods.
Handles scenarios with non-Lipschitz gradients and non-isolated saddle points.
Abstract
Optimization algorithms are unlikely to converge to strict saddle points. Proofs to that effect rely on the Center-Stable Manifold Theorem (CSMT), casting algorithms as dynamical systems: $x_{k+1} = g_k(x_k)$. In its standard form, the CSMT is limited to autonomous systems (the maps $g_k$ are all the same). To study algorithms such as gradient descent with non-constant step-size schedules, we need a non-autonomous CSMT. There are a few, but they are unable to handle, for example, vanishing step sizes. To cover such scenarios, we establish a new Center-Stable Set Theorem (CSST) for non-autonomous systems. We use it to prove saddle avoidance for gradient descent (Euclidean and Riemannian) and for the proximal point method, without assuming Lipschitz gradients or isolated saddles, and allowing vanishing step sizes.
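To make the setting concrete, here is a minimal sketch of gradient descent viewed as a non-autonomous system: the update map changes with the iteration index through the step size. Everything specific in it is an assumption chosen for illustration and does not come from the paper: the test function f(x, y) = x^2/2 + y^4/4 - y^2/2 (strict saddle at the origin, minima at (0, 1) and (0, -1)), the schedule 0.5 / sqrt(k + 1), and the names `grad`, `step_size` and `gradient_descent`. It illustrates the phenomenon only and does not check the hypotheses of any theorem.

```python
import math


def grad(x, y):
    # Gradient of the illustrative function f(x, y) = x**2/2 + y**4/4 - y**2/2.
    # The origin is a strict saddle: the Hessian there is diag(1, -1).
    return x, y**3 - y


def step_size(k):
    # Hypothetical vanishing schedule (an assumption, not taken from the paper).
    return 0.5 / math.sqrt(k + 1)


def gradient_descent(x0, y0, num_iters=2000):
    # Iterates (x, y) <- (x, y) - alpha_k * grad f(x, y).
    # The update map depends on k through alpha_k, so the system is non-autonomous.
    x, y = x0, y0
    for k in range(num_iters):
        alpha = step_size(k)
        gx, gy = grad(x, y)
        x, y = x - alpha * gx, y - alpha * gy
    return x, y


# A start slightly off the line y = 0 is pushed away from the saddle.
print(gradient_descent(0.3, 1e-6))

# A start exactly on the line y = 0 stays on it and approaches the saddle.
print(gradient_descent(0.3, 0.0))
```

In this toy example the initial points that converge to the saddle form the line y = 0, which has measure zero in the plane. That is the sense in which convergence to a strict saddle is unlikely from a generic starting point.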
