Two-Timescale Linear Stochastic Approximation: Constant Stepsizes Go a Long Way
Jeongyeol Kwon, Luke Dotson, Yudong Chen, Qiaomin Xie

TL;DR
This paper analyzes constant stepsize two-timescale stochastic approximation, showing convergence to a stationary distribution with explicit rates and bias/variance characterizations, enabling improved error bounds without restrictive assumptions.
Contribution
It provides the first detailed analysis of constant stepsize two-timescale SA, including explicit convergence rates and bias/variance scaling, without requiring restrictive conditions.
Findings
Iterates converge to a unique stationary distribution in Wasserstein metric.
Bias scales linearly with both stepsizes; the variance of each iterate scales only with its own stepsize.
Tail-averaging reduces mean-squared error to near-optimal bounds.
Abstract
Previous studies on two-timescale stochastic approximation (SA) mainly focused on bounding mean-squared errors under diminishing stepsize schemes. In this work, we investigate constant stepsize schemes through the lens of Markov processes, proving that the iterates of both timescales converge to a unique joint stationary distribution in Wasserstein metric. We derive explicit geometric and non-asymptotic convergence rates, as well as the variance and bias introduced by constant stepsizes in the presence of Markovian noise. Specifically, with two constant stepsizes, we show that the biases scale linearly with both stepsizes up to higher-order terms, while the variance of the slower iterate (resp., faster iterate) scales only with its own stepsize. Unlike previous work, our results require no…
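The setting described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code or experimental setup. It assumes a scalar toy system with hypothetical coefficients (A11, A12, A21, A22, b1, b2), uses the names alpha and beta for the slow and fast stepsizes, and replaces the Markovian noise of the paper with i.i.d. additive Gaussian noise. In such an additive-noise toy the mean of the iterates converges to the exact solution, so the stepsize-dependent bias is not reproduced; only the variance scaling and the effect of tail-averaging are visible.

```python
import numpy as np


def run_two_timescale(alpha, beta, n_steps=20000, seed=0):
    """Simulate a scalar two-timescale linear SA with constant stepsizes.

    Toy system (hypothetical coefficients, chosen only for illustration):
        2x + y = 1,  x + 2y = 1   with solution x* = y* = 1/3.
    x is the slow iterate (stepsize alpha), y the fast one (stepsize beta).
    """
    rng = np.random.default_rng(seed)
    A11, A12, A21, A22 = 2.0, 1.0, 1.0, 2.0
    b1, b2 = 1.0, 1.0

    x, y = 0.0, 0.0
    xs = np.empty(n_steps)
    ys = np.empty(n_steps)
    for k in range(n_steps):
        # Assumption: i.i.d. standard normal noise, not Markovian noise.
        noise_x, noise_y = rng.standard_normal(2)
        # Both updates use the previous iterates (x, y).
        x_next = x - alpha * (A11 * x + A12 * y - b1 + noise_x)
        y_next = y - beta * (A21 * x + A22 * y - b2 + noise_y)
        x, y = x_next, y_next
        xs[k] = x
        ys[k] = y
    return xs, ys


def tail_average(trajectory, burn_in_fraction=0.5):
    """Average the iterates after discarding an initial burn-in fraction."""
    start = int(len(trajectory) * burn_in_fraction)
    return trajectory[start:].mean()


if __name__ == "__main__":
    xs, ys = run_two_timescale(alpha=0.01, beta=0.05)
    print("last iterate x:", xs[-1], " tail-averaged x:", tail_average(xs))
    print("last iterate y:", ys[-1], " tail-averaged y:", tail_average(ys))

    # Empirical variance of the slow iterate for two values of alpha,
    # with beta held fixed.
    for alpha in (0.002, 0.02):
        xs, _ = run_two_timescale(alpha=alpha, beta=0.05, n_steps=40000)
        print("alpha =", alpha, " var of x over the tail:", xs[20000:].var())
```

Running the script with a fixed beta and a smaller alpha should show a smaller empirical variance for the slow iterate, and the tail-averaged values should sit closer to 1/3 than a typical single iterate does.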
Taxonomy
Topics: Stochastic processes and financial applications
