Generalization of Silver Stepsize Schedule to Stochastic Optimization
Luwei Bai, Yang Zeng, Baoyu Zhou

TL;DR
This paper extends the two-step silver stepsize schedule from deterministic to stochastic optimization, proposing a two-step schedule that converges faster than the constant stepsize α = 2/(M+m) when the variance of the stochastic gradients is small relative to the initial optimality gap.
Contribution
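It introduces a generalized two-step stepsize schedule for stochastic gradient methods on smooth strongly convex functions, improving convergence over the constant stepsize α = 2/(M+m) when the variance bound is small relative to the initial optimality gap.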
Findings
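Achieves better convergence than the constant stepsize α = 2/(M+m) when the variance bound is small relative to the initial optimality gap.
Applies to smooth strongly convex functions with stochastic gradients that are unbiased, of bounded variance, and supported on a finite set.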
Generalizes the silver stepsize schedule from deterministic to stochastic settings.
Abstract
This work introduces a two-step stepsize schedule for stochastic gradient methods minimizing smooth strongly convex functions. We consider the setting where only stochastic gradient approximations are accessible, and these approximations are unbiased, of bounded variance, and supported on a finite set. When the variance bound is sufficiently small relative to the initial optimality gap, the proposed stepsize schedule achieves better convergence performance than the well-regarded constant stepsize α = 2/(M+m), where m and M denote the strong convexity and gradient-Lipschitz parameters, respectively. Our stepsize schedule can be viewed as a generalization of the well-known two-step silver stepsize schedule in [J. M. Altschuler and P. A. Parrilo, Journal of the ACM, 72(2):1-38, 2025] from the deterministic setting to stochastic optimization.
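To make the baseline in the abstract concrete, the following is a minimal sketch of stochastic gradient descent with the constant stepsize α = 2/(M+m). It is not the paper's two-step schedule, whose stepsizes are not stated in this summary. The function name, the two-dimensional quadratic objective, and the sign-flip noise model are all illustrative assumptions chosen so that the noise is unbiased, has bounded variance, and takes finitely many values.

```python
import random


def sgd_constant_stepsize(m, M, x0, sigma, steps, seed=0):
    """Hypothetical helper: SGD with stepsize 2/(M+m) on a toy quadratic.

    Assumed objective (not from the paper):
        f(x) = 0.5 * (m * x[0]**2 + M * x[1]**2),
    which is m-strongly convex with an M-Lipschitz gradient and minimum value 0.

    Assumed noise model: each gradient coordinate is perturbed by
    +sigma/sqrt(2) or -sigma/sqrt(2) with equal probability, so the noise is
    unbiased, finitely supported, and has E||noise||^2 = sigma**2.

    Returns the list of optimality gaps f(x_k) - f* for k = 1..steps.
    """
    rng = random.Random(seed)
    alpha = 2.0 / (M + m)  # constant stepsize named in the abstract
    x = list(x0)
    gaps = []
    for _ in range(steps):
        grad = [m * x[0], M * x[1]]  # exact gradient of the toy quadratic
        noise = [rng.choice((-1.0, 1.0)) * sigma / 2 ** 0.5 for _ in range(2)]
        x = [xi - alpha * (gi + ni) for xi, gi, ni in zip(x, grad, noise)]
        gaps.append(0.5 * (m * x[0] ** 2 + M * x[1] ** 2))
    return gaps
```

With sigma = 0 this reduces to deterministic gradient descent, and on this quadratic each step shrinks the distance to the minimizer by the factor (M-m)/(M+m), so the optimality gap shrinks by its square. With sigma > 0 the gap instead settles at a noise-dependent level, which is the regime where the abstract compares the constant stepsize against the proposed schedule.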
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Advanced Optimization Algorithms Research
