Multi Timescale Stochastic Approximation: Stability and Convergence
Rohan Deb, Swetha Ganesh, Shalabh Bhatnagar

TL;DR
This paper establishes comprehensive conditions for the stability and convergence of multi-timescale stochastic approximation algorithms, extending previous results to N-timescale systems and applying them to reinforcement learning methods with momentum, off-policy learning, and constrained optimization.
Contribution
It provides the first unified framework with sufficient conditions for stability and convergence of N-timescale stochastic approximation algorithms, including practical RL algorithms with momentum and primal-dual methods.
Findings
Proved stability and convergence for N-timescale stochastic approximation.
Applied framework to momentum-augmented Gradient TD learning.
Guaranteed convergence of off-policy actor-critic and constrained policy optimization.
Abstract
This paper presents the first sufficient conditions that guarantee the stability and almost sure convergence of multi-timescale stochastic approximation (SA) iterates. It extends the existing results on one-timescale and two-timescale SA iterates to general N-timescale stochastic recursions, for any number of timescales N, using the ordinary differential equation (ODE) method. As an application, we study SA algorithms augmented with heavy-ball momentum in the context of Gradient Temporal Difference (GTD) learning. The added momentum introduces an auxiliary state evolving on an intermediate timescale, yielding a three-timescale recursion. We show that with appropriate momentum parameters, the scheme fits within our framework and converges almost surely to the same fixed point as baseline GTD. The stability and convergence of all iterates, including the momentum state, follow from our main results…
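To make the idea of a three-timescale recursion concrete, here is a minimal toy sketch in Python. It is not the momentum GTD scheme from the paper: the linear drifts, the step-size exponents (0.6, 0.8, 1.0), the offset of 10 in the step sizes, and the function name `three_timescale_sa` are all assumptions chosen for illustration. Each iterate uses a step size that becomes negligible relative to the next faster one, so the fast iterate tracks the intermediate one, the intermediate iterate tracks the slow one, and the slow iterate is driven toward a target through the fast iterate.

```python
import numpy as np


def three_timescale_sa(target=1.0, n_iters=50_000, noise_std=0.1, seed=0):
    """Toy three-timescale stochastic approximation (illustrative only).

    Step sizes are a_i(n) = (n + 10)^(-p_i) with p = (0.6, 0.8, 1.0),
    so a_mid / a_fast -> 0 and a_slow / a_mid -> 0 as n grows.
    These choices are assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    exponents = (0.6, 0.8, 1.0)  # fast, intermediate, slow
    x_fast = x_mid = x_slow = 0.0

    for n in range(n_iters):
        a_fast, a_mid, a_slow = ((n + 10.0) ** -p for p in exponents)
        # Independent zero-mean noise for each recursion.
        xi = noise_std * rng.standard_normal(3)

        # All three updates use the previous values of the iterates.
        new_fast = x_fast + a_fast * (x_mid - x_fast + xi[0])
        new_mid = x_mid + a_mid * (x_slow - x_mid + xi[1])
        new_slow = x_slow + a_slow * (target - x_fast + xi[2])
        x_fast, x_mid, x_slow = new_fast, new_mid, new_slow

    return x_fast, x_mid, x_slow


if __name__ == "__main__":
    print(three_timescale_sa())
```

In this toy problem the only equilibrium is the point where all three iterates equal `target`, so running the function should return three values close to 1.0. The offset of 10 keeps the early steps small, a common practical safeguard against large initial transients.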
Taxonomy
Topics: Stochastic processes and financial applications · Stochastic Gradient Optimization Techniques · Markov Chains and Monte Carlo Methods
