Optimal Local Convergence Rates of Stochastic First-Order Methods under Local $\alpha$-PL

Saeed Masiha, Saber Salehkaleybar, Niao He, Negar Kiyavash, and Patrick Thiran

arXiv:2408.01839 · math.OC · February 24, 2026

TL;DR

This paper establishes the optimal local convergence rates of stochastic first-order methods under a local $\alpha$-PL condition, providing matching lower and upper bounds that depend on the parameter $\alpha$, with implications for both non-convex and convex optimization.

Contribution

The paper derives tight bounds on the oracle complexity of stochastic first-order methods under a local $\alpha$-PL condition, covering the full range $\alpha \in [1, 2]$ between Hölder-type error bounds ($\alpha = 1$) and the classical PL case ($\alpha = 2$).

Findings

01

Lower bound of $\Omega(\varepsilon^{-2/\alpha})$ for all stochastic first-order methods.

02

Matching upper bound achieved by a SARAH-type variance-reduced method.

03

Complexity bounds in the convex setting under the local $\alpha$-PL condition.
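To make the first finding concrete, the short sketch below evaluates only the $\varepsilon$-dependence $\varepsilon^{-2/\alpha}$ of the stated rate for a few values of $\alpha$. It is an illustration under assumptions, not the paper's bound: all constants and any other problem-dependent factors are dropped, and the function name `query_scaling` is hypothetical.

```python
def query_scaling(eps: float, alpha: float) -> float:
    """Return eps**(-2/alpha): the eps-dependence of the rate, constants dropped.

    Hypothetical helper for illustration only; it does not reproduce the
    constants or any other factors of the paper's bounds.
    """
    if not 1.0 <= alpha <= 2.0:
        raise ValueError("alpha must lie in [1, 2]")
    if eps <= 0.0:
        raise ValueError("eps must be positive")
    return eps ** (-2.0 / alpha)


if __name__ == "__main__":
    eps = 1e-2  # target accuracy, chosen arbitrarily for the illustration
    for alpha in (1.0, 1.5, 2.0):
        print(f"alpha = {alpha}: eps^(-2/alpha) ~ {query_scaling(eps, alpha):.1f}")
```

The exponent $2/\alpha$ falls from 2 at $\alpha = 1$ to 1 at $\alpha = 2$, so for a fixed small $\varepsilon$ the scaling is largest in the $\alpha = 1$ regime and smallest in the classical PL case.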

Abstract

We study the local convergence rate of stochastic first-order methods under a local $\alpha$-Polyak-Łojasiewicz ($\alpha$-PL) condition in a neighborhood of a target connected component $M$ of the local minimizer set. The parameter $\alpha \in [1, 2]$ is the exponent of the gradient norm in the $\alpha$-PL inequality: $\alpha = 2$ recovers the classical PL case, $\alpha = 1$ corresponds to Hölder-type error bounds, and intermediate values interpolate between these regimes. Our performance criterion is the number of oracle queries required to output $\hat{x}$ with $F(\hat{x}) - l \leq \varepsilon$, where $l := F(y)$ for any $y \in M$. We work in a local regime where the algorithm is initialized near $M$ and, with high probability, its iterates remain in that neighborhood. We establish a lower bound $\Omega(\varepsilon^{-2/\alpha})$ for all stochastic first-order…
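The truncated abstract does not display the inequality itself. One common way to write a local $\alpha$-PL condition consistent with its description (with $\alpha$ as the exponent of the gradient norm) is sketched below. The constant $\mu > 0$, the factor $2$, and the neighborhood $U$ of $M$ are notational assumptions made here for illustration, and the paper's exact normalization may differ.

```latex
% Assumed form of the local alpha-PL inequality; mu, the factor 2,
% and the neighborhood U are illustrative notation, not taken from the paper.
\[
  \|\nabla F(x)\|^{\alpha} \;\ge\; 2\mu \,\bigl(F(x) - l\bigr)
  \qquad \text{for all } x \in U, \quad \alpha \in [1, 2].
\]
% Setting alpha = 2 gives the classical PL inequality:
\[
  \|\nabla F(x)\|^{2} \;\ge\; 2\mu \,\bigl(F(x) - l\bigr).
\]
```

In this form, whenever $\|\nabla F(x)\| \le 1$ the $\alpha = 2$ inequality implies the $\alpha = 1$ one (since $\|\nabla F(x)\|^{2} \le \|\nabla F(x)\|$), which is in line with the exponent $2/\alpha$ in the lower bound growing as $\alpha$ decreases.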

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Markov Chains and Monte Carlo Methods · Advanced Memory and Neural Computing