Lower Bounds for Finding Stationary Points II: First-Order Methods
Yair Carmon, John C. Duchi, Oliver Hinder, Aaron Sidford

TL;DR
This paper establishes lower bounds on the complexity of finding approximate stationary points of smooth non-convex functions with first-order methods, showing that no deterministic method of this kind can converge faster than certain fixed rates in the accuracy $\epsilon$.
Contribution
It proves new lower bounds on the convergence rates of deterministic first-order methods in non-convex optimization, covering several smoothness classes and contrasting them with the faster rate attainable under convexity.
Findings
Deterministic first-order methods cannot achieve a convergence rate better than $\epsilon^{-8/5}$, even on arbitrarily smooth functions.
For functions with Lipschitz first and second derivatives, the lower bound is $\epsilon^{-12/7}$.
For convex functions with Lipschitz gradient, accelerated gradient descent achieves the faster rate $\epsilon^{-1}\log(1/\epsilon)$.
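To get a feel for how far apart these rates are, the sketch below evaluates the $\epsilon$-dependence of each one at a given accuracy. It is only an illustration: all problem-dependent constants (smoothness parameters, initial function gap) are set to 1, which is an assumption made here and not something the paper states, and the function name `rate_table` and its dictionary keys are hypothetical.

```python
import math


def rate_table(eps):
    """Evaluate the eps-dependence of each rate, with all constants set to 1.

    The values are not iteration counts for any real problem; they only
    show how the three expressions scale relative to one another.
    """
    if not 0 < eps < 1:
        raise ValueError("eps must lie in (0, 1)")
    return {
        # Lower bound for deterministic first-order methods,
        # even on arbitrarily smooth functions.
        "arbitrarily_smooth": eps ** (-8 / 5),
        # Lower bound with Lipschitz first and second derivatives.
        "lipschitz_grad_hessian": eps ** (-12 / 7),
        # Upper bound for convex functions with Lipschitz gradient
        # (accelerated gradient descent).
        "convex_agd": (1 / eps) * math.log(1 / eps),
    }


if __name__ == "__main__":
    for eps in (1e-2, 1e-4, 1e-6):
        row = rate_table(eps)
        print(f"eps={eps:g}: " + ", ".join(f"{k}={v:.3g}" for k, v in row.items()))
```

For small $\epsilon$, the convex rate is well below both non-convex lower bounds, which is the sense in which finding stationary points is easier given convexity.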
Abstract
We establish lower bounds on the complexity of finding $\epsilon$-stationary points of smooth, non-convex high-dimensional functions using first-order methods. We prove that deterministic first-order methods, even applied to arbitrarily smooth functions, cannot achieve convergence rates in $\epsilon$ better than $\epsilon^{-8/5}$, which is within $\epsilon^{-1/15}\log(1/\epsilon)$ of the best known rate for such methods. Moreover, for functions with Lipschitz first and second derivatives, we prove no deterministic first-order method can achieve convergence rates better than $\epsilon^{-12/7}$, while $\epsilon^{-2}$ is a lower bound for functions with only Lipschitz gradient. For convex functions with Lipschitz gradient, accelerated gradient descent achieves the rate $\epsilon^{-1}\log(1/\epsilon)$, showing that finding stationary points is easier given convexity.
