Finite-Time Error Bounds For Linear Stochastic Approximation and TD Learning

R. Srikant; Lei Ying

arXiv:1902.00923 · cs.LG · March 11, 2019 · 42 cites

Open Access

TL;DR

This paper derives finite-time error bounds for linear stochastic approximation algorithms driven by Markovian noise, providing insights into moments of the error and solving an open problem in TD learning performance analysis.

Contribution

The paper derives finite-time bounds on the moments of the error for linear stochastic approximation and TD learning, without assuming i.i.d. noise or requiring a projection step.

Findings

01. Lower-order moments can be made small by adjusting the step-size.

02. Higher-order moments may be infinite in steady-state.

03. The paper characterizes the sample complexity needed for the finite-time bounds to match steady-state performance.

Abstract

We consider the dynamics of a linear stochastic approximation algorithm driven by Markovian noise, and derive finite-time bounds on the moments of the error, i.e., deviation of the output of the algorithm from the equilibrium point of an associated ordinary differential equation (ODE). We obtain finite-time bounds on the mean-square error in the case of constant step-size algorithms by considering the drift of an appropriately chosen Lyapunov function. The Lyapunov function can be interpreted either in terms of Stein's method to obtain bounds on steady-state performance or in terms of Lyapunov stability theory for linear ODEs. We also provide a comprehensive treatment of the moments of the square of the 2-norm of the approximation error. Our analysis yields the following results: (i) for a given step-size, we show that the lower-order moments can be made small as a function of the…
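To make the setting concrete, the following is a minimal sketch of TD(0) with linear function approximation, run with a constant step-size on a single Markovian trajectory and without any projection step. It tracks the squared 2-norm of the deviation from the fixed point of the averaged update. The three-state chain, rewards, features, discount factor, and the function name `td0_constant_step` are all illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def td0_constant_step(P, r, Phi, gamma, eps, n_steps, seed=0):
    """Run TD(0) with constant step-size eps on one Markovian trajectory.

    Returns the fixed point theta_star of the averaged update and the
    squared 2-norm of (theta_k - theta_star) at every step.
    All inputs are hypothetical toy quantities chosen for illustration.
    """
    rng = np.random.default_rng(seed)
    n_states = P.shape[0]

    # Stationary distribution of the chain (power iteration on P).
    pi = np.linalg.matrix_power(P, 1000)[0]
    D = np.diag(pi)

    # The averaged TD(0) update is b - A @ theta, so theta_star solves
    # A @ theta_star = b.
    A = Phi.T @ D @ (np.eye(n_states) - gamma * P) @ Phi
    b = Phi.T @ D @ r
    theta_star = np.linalg.solve(A, b)

    theta = np.zeros(Phi.shape[1])
    x = 0
    sq_err = np.empty(n_steps)
    for k in range(n_steps):
        # Markovian noise: the next state depends on the current one.
        x_next = rng.choice(n_states, p=P[x])
        td_error = r[x] + gamma * Phi[x_next] @ theta - Phi[x] @ theta
        theta = theta + eps * td_error * Phi[x]
        sq_err[k] = np.sum((theta - theta_star) ** 2)
        x = x_next
    return theta_star, sq_err


# Toy problem (assumed for illustration only).
P = np.array([[0.50, 0.50, 0.00],
              [0.25, 0.50, 0.25],
              [0.00, 0.50, 0.50]])
r = np.array([1.0, 0.0, -1.0])
Phi = np.array([[1.0, 0.0],
                [0.0, 1.0],
                [-1.0, 0.0]])

theta_star, sq_err = td0_constant_step(
    P, r, Phi, gamma=0.9, eps=0.01, n_steps=20000
)
print("theta_star:", theta_star)
print("mean squared error over last 1000 steps:", sq_err[-1000:].mean())
```

Rerunning the sketch with different values of `eps` gives a rough empirical feel for how the step-size trades off convergence speed against the size of the residual error. It only illustrates the setting and does not reproduce the paper's bounds.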

Taxonomy

Topics: Advanced Bandit Algorithms Research · Markov Chains and Monte Carlo Methods · Reinforcement Learning in Robotics