Optimal convergence rates for Nesterov acceleration
Jean-François Aujol (IMB), Charles Dossal (IMT), Aude Rondepierre (IMT, LAAS-ROC)

TL;DR
This paper studies the ODE associated with Nesterov acceleration. It shows that improved convergence rates are achievable under additional geometric conditions, and that the classical Nesterov scheme may converge more slowly than gradient descent on sharp functions.
Contribution
The paper proves new, optimal convergence rates for Nesterov acceleration under geometric conditions such as the Łojasiewicz property, and highlights limitations of the classical scheme.
Findings
Better convergence rates are possible under additional geometric conditions, such as the Łojasiewicz property.
Classical Nesterov may perform worse than gradient descent on sharp functions.
Convergence rates depend on the geometry of the objective function.
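The second finding can be illustrated with a small discrete experiment. The sketch below is not taken from the paper: it assumes a toy strongly convex quadratic, a step size of 1/L, and the common momentum coefficient (k - 1)/(k + 2) for the Nesterov scheme. The function names `objective` and `run` are hypothetical.

```python
def objective(eigs, x):
    """f(x) = 0.5 * sum_i eigs[i] * x[i]**2, a strongly convex quadratic."""
    return 0.5 * sum(lam * xi * xi for lam, xi in zip(eigs, x))


def run(method, eigs, x0, n_iter):
    """Run gradient descent ("gd") or a Nesterov-type scheme ("nesterov").

    Returns the list of objective values f(x_0), ..., f(x_n_iter).
    """
    step = 1.0 / max(eigs)  # step size 1/L, with L the largest eigenvalue
    x_prev, x = list(x0), list(x0)
    history = [objective(eigs, x)]
    for k in range(1, n_iter + 1):
        # Assumed momentum rule; beta = 0 reduces the update to gradient descent.
        beta = (k - 1) / (k + 2) if method == "nesterov" else 0.0
        # Extrapolated point y_k = x_k + beta_k * (x_k - x_{k-1}).
        y = [xi + beta * (xi - xpi) for xi, xpi in zip(x, x_prev)]
        x_prev = x
        # Gradient step from y; the gradient of f at y is eigs * y.
        x = [yi - step * lam * yi for lam, yi in zip(eigs, y)]
        history.append(objective(eigs, x))
    return history


eigs = [1.0, 0.25]  # toy problem: L = 1, strong convexity constant 0.25
x0 = [1.0, 1.0]
f_gd = run("gd", eigs, x0, 200)
f_nag = run("nesterov", eigs, x0, 200)
print("gradient descent, final objective:", f_gd[-1])
print("Nesterov scheme, largest objective over last 10 iterates:", max(f_nag[-10:]))
```

On this toy problem, gradient descent ends with a far smaller objective value than the momentum scheme. The Nesterov iterates oscillate, so the comparison uses the largest value over the last few iterates rather than a single one. This only illustrates the qualitative claim; it does not reproduce the paper's continuous-time rates.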
Abstract
In this paper, we study the behavior of solutions of the ODE associated to Nesterov acceleration. It is well-known since the pioneering work of Nesterov that the rate of convergence is optimal for the class of convex functions with Lipschitz gradient. In this work, we show that better convergence rates can be obtained with some additional geometrical conditions, such as Łojasiewicz property. More precisely, we prove the optimal convergence rates that can be obtained depending on the geometry of the function to minimize. The convergence rates are new, and they shed new light on the behavior of Nesterov acceleration schemes. We prove in particular that the classical Nesterov scheme may provide convergence rates that are worse than the classical gradient descent scheme on sharp functions: for instance, the convergence rate for strongly convex functions is not geometric for…
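For orientation, the ODE usually associated with Nesterov acceleration can be written as below, followed by one growth condition commonly used as a Łojasiewicz-type hypothesis for convex functions. The notation is an assumption and does not appear in the excerpt above: F is the objective, α > 0 a friction parameter, X^* the set of minimizers, F^* the minimal value, and K > 0, γ ≥ 1 constants. The paper's exact hypotheses may be stated differently.

```latex
% ODE associated with Nesterov acceleration (assumed notation)
\[
  \ddot{x}(t) + \frac{\alpha}{t}\,\dot{x}(t) + \nabla F\bigl(x(t)\bigr) = 0,
  \qquad t \ge t_0 > 0 .
\]

% A Lojasiewicz-type growth condition with exponent \gamma \ge 1 (assumed form)
\[
  F(x) - F^{*} \;\ge\; K \, d\bigl(x, X^{*}\bigr)^{\gamma}
  \qquad \text{for all } x \text{ in a neighborhood of } X^{*} .
\]
```

In this form, γ = 2 covers strongly convex functions, and smaller γ corresponds to sharper functions.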
