Revisiting Step-Size Assumptions in Stochastic Approximation
Caio Kalil Lauand, Sean Meyn

TL;DR
This paper challenges the traditional assumption that the step-size sequence in stochastic approximation must be square summable, showing convergence under weaker conditions and providing refined convergence rates and optimal covariance results.
Contribution
It proves that the square summability condition on step-sizes is unnecessary for convergence and extends the theory to parameter-dependent Markovian noise, with improved convergence rates and bias analysis.
Findings
Convergence occurs without the need for square summability of step-sizes.
Averaging techniques improve the mean-squared error rate to O(max{α_n^2, 1/n}).
Covariance of estimates is optimal under certain conditions.
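The findings above can be illustrated with a minimal sketch. It is not the authors' code. It assumes a scalar linear recursion with i.i.d. Gaussian noise (simpler than the Markovian noise treated in the paper) and a hypothetical step-size α_n = alpha0 * n^(-rho) with rho = 0.4, so the step-sizes are not square summable. The names run_sa, alpha0, rho, and theta_star are illustrative only.

```python
import random


def run_sa(n_iter=20000, alpha0=1.0, rho=0.4, theta_star=2.0, seed=0):
    """Scalar SA: theta <- theta + alpha_n * (theta_star - theta + noise).

    Assumption: alpha_n = alpha0 * n**(-rho). With rho <= 0.5 the
    sequence sums to infinity but is NOT square summable.
    Returns the last iterate and a Polyak-Ruppert style average
    taken over the second half of the run.
    """
    rng = random.Random(seed)
    theta = 0.0
    avg = 0.0
    count = 0
    burn_in = n_iter // 2  # discard the transient before averaging
    for n in range(1, n_iter + 1):
        alpha = alpha0 * n ** (-rho)
        noise = rng.gauss(0.0, 1.0)  # i.i.d. noise, an illustrative simplification
        theta += alpha * (theta_star - theta + noise)
        if n > burn_in:
            count += 1
            avg += (theta - avg) / count  # running mean of the iterates
    return theta, avg


if __name__ == "__main__":
    last, averaged = run_sa()
    print("last iterate:", last)
    print("averaged estimate:", averaged)
```

In this toy setting the last iterate fluctuates around theta_star at a scale set by the current step-size, while the averaged estimate is typically much closer, which is the qualitative effect the averaging result describes.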
Abstract
Many machine learning and optimization algorithms are built upon the framework of stochastic approximation (SA), for which the selection of step-size (or learning rate) is crucial for success. An essential condition for convergence is the assumption that ∑ α_n = ∞. Moreover, in all theory to date it is assumed that ∑ α_n^2 < ∞ (the sequence is square summable). In this paper it is shown for the first time that this assumption is not required for convergence and finer results. The main results are restricted to the special case α_n = α_0 n^(-ρ) with ρ ∈ (0, 1). The theory allows for parameter-dependent Markovian noise as found in many applications of interest to the machine learning and optimization research communities. Rates of convergence are obtained for the standard algorithm, and for estimates obtained via the…
Taxonomy
Topics: Simulation Techniques and Applications
