Acceleration by Stepsize Hedging I: Multi-Step Descent and the Silver Stepsize Schedule
Jason M. Altschuler, Pablo A. Parrilo

TL;DR
This paper introduces the Silver Stepsize Schedule, a novel stepsize strategy that accelerates gradient descent convergence on convex functions without altering the algorithm, achieving rates between unaccelerated and Nesterov's accelerated methods.
Contribution
The paper proposes a fully explicit, non-monotonic, fractal-like stepsize schedule that improves convergence rates, bridging the gap between unaccelerated and accelerated gradient descent.
Findings
Achieves convergence in approximately k^{0.7864} iterations for strongly convex functions, where k is the condition number.
Provides a recursive, explicit construction of the Silver Stepsize Schedule.
Conjectures, with partial supporting evidence, that these rates are optimal among all possible stepsize schedules.
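To make the recursive, non-monotonic, fractal-like structure concrete, here is a minimal Python sketch. It is an assumption, not a formula stated in this summary: it uses the form usually quoted for the smooth convex variant of the schedule, with stepsizes normalized by the smoothness constant. The strongly convex schedule described here also depends on the condition number and is not reproduced. The function names are hypothetical.

```python
import math

RHO = 1 + math.sqrt(2)  # the silver ratio


def two_adic_valuation(i):
    """Return the largest v such that 2**v divides i (for i >= 1)."""
    v = 0
    while i % 2 == 0:
        i //= 2
        v += 1
    return v


def silver_schedule_recursive(depth):
    """Assumed convex-case construction: start from [sqrt(2)] and repeatedly
    form [schedule, 1 + RHO**(k - 1), schedule]. Returns 2**depth - 1 steps."""
    schedule = [math.sqrt(2)]
    for k in range(1, depth):
        schedule = schedule + [1 + RHO ** (k - 1)] + schedule
    return schedule


def silver_schedule_closed_form(n):
    """Equivalent assumed closed form: h_i = 1 + RHO**(v(i) - 1),
    where v(i) is the 2-adic valuation of i."""
    return [1 + RHO ** (two_adic_valuation(i) - 1) for i in range(1, n + 1)]


if __name__ == "__main__":
    for step in silver_schedule_recursive(3):
        print(round(step, 4))
```

Under these assumptions the two functions agree, and the occasional long step in the middle of each block is what makes the schedule non-monotonic.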
Abstract
Can we accelerate convergence of gradient descent without changing the algorithm -- just by carefully choosing stepsizes? Surprisingly, we show that the answer is yes. Our proposed Silver Stepsize Schedule optimizes strongly convex functions in k^{log_ρ 2} ≈ k^{0.7864} iterations, where ρ = 1 + √2 is the silver ratio and k is the condition number. This is intermediate between the textbook unaccelerated rate k and the accelerated rate √k due to Nesterov in 1983. The non-strongly convex setting is conceptually identical, and standard black-box reductions imply an analogous accelerated rate ε^{-log_ρ 2} ≈ ε^{-0.7864}. We conjecture and provide partial evidence that these rates are optimal among all possible stepsize schedules. The Silver Stepsize Schedule is constructed recursively in a fully explicit way. It is…
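The exponent 0.7864 is the base-ρ logarithm of 2 for the silver ratio ρ = 1 + √2. The short Python sketch below checks that arithmetic and compares how the three rates scale with the condition number k. The function name is hypothetical, and constants and the dependence on target accuracy are deliberately ignored, so the numbers show scaling only, not actual iteration counts.

```python
import math

RHO = 1 + math.sqrt(2)                  # silver ratio
EXPONENT = math.log(2) / math.log(RHO)  # log base RHO of 2, about 0.7864


def rate_comparison(k):
    """Scaling in the condition number k for the three regimes
    (constants and accuracy-dependent factors omitted)."""
    return {
        "unaccelerated": k,
        "silver": k ** EXPONENT,
        "nesterov": math.sqrt(k),
    }


if __name__ == "__main__":
    print("exponent:", round(EXPONENT, 4))
    for k in (10, 100, 1000):
        print(k, rate_comparison(k))
```

For any k > 1 the silver value falls strictly between the accelerated and unaccelerated ones, which is the sense in which the rate is intermediate.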
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Markov Chains and Monte Carlo Methods · Neural Networks and Applications
