Finite-Time Error Bounds For Linear Stochastic Approximation and TD Learning
R. Srikant, Lei Ying

TL;DR
This paper derives finite-time error bounds for linear stochastic approximation algorithms driven by Markovian noise, providing insights into moments of the error and solving an open problem in TD learning performance analysis.
Contribution
The paper derives finite-time bounds on the moments of the error for linear stochastic approximation and TD learning, without assuming i.i.d. noise and without requiring a projection step.
Findings
For a given step-size, the lower-order moments of the error can be made small as a function of the step-size.
Higher-order moments may be infinite in steady-state.
The paper characterizes the number of samples needed for the finite-time bounds to match steady-state performance.
Abstract
We consider the dynamics of a linear stochastic approximation algorithm driven by Markovian noise, and derive finite-time bounds on the moments of the error, i.e., deviation of the output of the algorithm from the equilibrium point of an associated ordinary differential equation (ODE). We obtain finite-time bounds on the mean-square error in the case of constant step-size algorithms by considering the drift of an appropriately chosen Lyapunov function. The Lyapunov function can be interpreted either in terms of Stein's method to obtain bounds on steady-state performance or in terms of Lyapunov stability theory for linear ODEs. We also provide a comprehensive treatment of the moments of the square of the 2-norm of the approximation error. Our analysis yields the following results: (i) for a given step-size, we show that the lower-order moments can be made small as a function of the…
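The setting described in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' experiment: it assumes a scalar recursion of the form theta_{k+1} = theta_k + step_size * (a(X_k) * theta_k + b(X_k)), where X_k is a two-state Markov chain. The chain, the coefficients `A` and `B`, and the function name `mean_square_error` are all hypothetical choices made for illustration. It estimates the long-run mean-square deviation from the ODE equilibrium for two constant step-sizes.

```python
import random

# Hypothetical two-state Markov chain and scalar coefficients (illustration only).
P_STAY = 0.9          # probability of staying in the current state (symmetric chain)
A = (-1.5, -0.5)      # a(x) for x in {0, 1}; mean under the uniform stationary law is -1.0
B = (1.0, 2.0)        # b(x) for x in {0, 1}; mean under the uniform stationary law is 1.5
THETA_STAR = 1.5      # equilibrium of the ODE: mean(a) * theta + mean(b) = 0


def mean_square_error(step_size, num_steps=100_000, burn_in=10_000, seed=0):
    """Time-averaged squared error of constant step-size linear SA after burn-in."""
    rng = random.Random(seed)
    x, theta = 0, 0.0
    total, count = 0.0, 0
    for k in range(num_steps):
        # Linear stochastic approximation update driven by the current chain state.
        theta += step_size * (A[x] * theta + B[x])
        if k >= burn_in:
            total += (theta - THETA_STAR) ** 2
            count += 1
        # Markovian noise: the next state depends on the current one.
        if rng.random() > P_STAY:
            x = 1 - x
    return total / count


mse_large = mean_square_error(0.1)
mse_small = mean_square_error(0.01)
print(f"step-size 0.1:  mean-square error ~ {mse_large:.4f}")
print(f"step-size 0.01: mean-square error ~ {mse_small:.4f}")
```

In this toy setup the smaller step-size gives a smaller mean-square error, in line with the claim that lower-order moments shrink with the step-size. The iterate does not converge to the equilibrium for a fixed step-size; it keeps fluctuating around it.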
Taxonomy
Topics: Advanced Bandit Algorithms Research · Markov Chains and Monte Carlo Methods · Reinforcement Learning in Robotics
