Escaping Saddle Points via Curvature-Calibrated Perturbations: A Complete Analysis with Explicit Constants and Empirical Validation

Faruk Alpay; Hamdi Alakkad

arXiv:2508.16540·cs.LG·August 25, 2025

TL;DR

This paper introduces Perturbed Saddle-escape Descent (PSD), a first-order algorithm for escaping strict saddle points in smooth non-convex optimization. Its analysis gives fully explicit constants, and its predictions are validated through experiments on synthetic functions and machine learning tasks.

Contribution

The paper provides a complete theoretical analysis of the PSD algorithm with explicit constants, separates the gradient-descent phase from the saddle-escape phase, and introduces variants for practical use.

Findings

01. PSD finds approximate second-order stationary points efficiently.

02. Theoretical bounds match empirical performance.

03. Logarithmic dependence on problem dimension is confirmed.
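To make the logarithmic dependence concrete, the per-episode escape cost can be evaluated for two dimensions. This is a rough sketch, not the paper's bound: it assumes the cost scales as $(\ell / \sqrt{\rho \epsilon}) \log(d / \delta)$ with the hidden constant set to 1 and with $\delta$ read as the allowed failure probability. The function name `escape_episode_scale` is hypothetical.

```python
import math


def escape_episode_scale(ell, rho, eps, d, delta):
    """Scale of one escape episode: (ell / sqrt(rho * eps)) * log(d / delta).

    The hidden constant is taken as 1 (an assumption; the paper's explicit
    constants are not reproduced here).
    """
    return (ell / math.sqrt(rho * eps)) * math.log(d / delta)


# Same problem constants, dimension raised from 10 to 10,000.
small = escape_episode_scale(ell=1.0, rho=1.0, eps=1e-4, d=10, delta=0.1)
large = escape_episode_scale(ell=1.0, rho=1.0, eps=1e-4, d=10_000, delta=0.1)

# log(1e5) / log(1e2) = 5 / 2, so the ratio is 2.5 up to rounding.
ratio = large / small
print(f"d=10: {small:.1f}, d=10000: {large:.1f}, ratio: {ratio:.2f}")
```

Under these assumptions, a thousandfold increase in dimension raises the per-episode scale by a factor of only 2.5.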

Abstract

We present a comprehensive theoretical analysis of first-order methods for escaping strict saddle points in smooth non-convex optimization. Our main contribution is a Perturbed Saddle-escape Descent (PSD) algorithm with fully explicit constants and a rigorous separation between gradient-descent and saddle-escape phases. For a function $f : \mathbb{R}^{d} \to \mathbb{R}$ with $\ell$-Lipschitz gradient and $\rho$-Lipschitz Hessian, we prove that PSD finds an $(\epsilon, \sqrt{\rho \epsilon})$-approximate second-order stationary point with high probability using at most $O(\ell \Delta_{f} / \epsilon^{2})$ gradient evaluations for the descent phase plus $O((\ell / \sqrt{\rho \epsilon}) \log(d / \delta))$ evaluations per escape episode, with at most $O(\ell \Delta_{f} / \epsilon^{2})$ episodes needed. We validate our theoretical predictions through extensive experiments across both synthetic functions and practical…
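The two-phase structure described in the abstract can be illustrated with a generic perturbed gradient descent loop. This is a minimal sketch under stated assumptions, not the authors' PSD specification. The step size $1/\ell$ is the standard choice for an $\ell$-smooth function. The perturbation radius, the episode length, and the per-episode decrease test are hypothetical choices with all constants set to 1. The function name `perturbed_descent_sketch` and the test function are also illustrative.

```python
import numpy as np


def perturbed_descent_sketch(f, grad, x0, ell, rho, eps, delta=0.1,
                             max_iters=10_000, seed=0):
    """Two-phase loop: plain gradient steps plus perturbed escape episodes.

    Illustrative only. All constants are set to 1 and are NOT the paper's.
    Returns (point, number_of_escape_episodes).
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    d = x.size
    eta = 1.0 / ell                       # standard step size for ell-smooth f
    gamma = np.sqrt(rho * eps)            # curvature scale sqrt(rho * eps)
    radius = eps / ell                    # hypothetical perturbation radius
    episode_len = int(np.ceil((ell / gamma) * np.log(d / delta)))
    min_decrease = np.sqrt(eps**3 / rho)  # hypothetical per-episode decrease test
    episodes, used = 0, 0

    while used < max_iters:
        g = grad(x)
        if np.linalg.norm(g) > eps:
            # Descent phase: large gradient, take a plain gradient step.
            x = x - eta * g
            used += 1
            continue
        # Escape episode: small gradient, perturb uniformly within a ball.
        episodes += 1
        u = rng.normal(size=d)
        u *= radius * rng.uniform() ** (1.0 / d) / np.linalg.norm(u)
        y = x + u
        for _ in range(episode_len):
            y = y - eta * grad(y)
        used += episode_len
        if f(x) - f(y) < min_decrease:
            # Little progress: treat x as approximately second-order stationary.
            return x, episodes
        x = y
    return x, episodes


# Toy function: strict saddle at the origin, minima at (0, +1) and (0, -1).
def f(z):
    return 0.5 * z[0]**2 - 0.5 * z[1]**2 + 0.25 * z[1]**4


def grad(z):
    return np.array([z[0], -z[1] + z[1]**3])


# Starting on the axis z[1] = 0, plain gradient descent would stall at the saddle.
x_final, n_episodes = perturbed_descent_sketch(
    f, grad, [1.0, 0.0], ell=4.0, rho=6.0, eps=1e-2)
print(x_final, n_episodes)
```

The values `ell=4.0` and `rho=6.0` bound the Hessian and its variation only on the region $|z_2| \le 1$ that the iterates visit, so they are local assumptions for this toy problem rather than global Lipschitz constants.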
