Finite-Time Accuracy of Temporal-Difference Learning Under Schur-Stable Recursions

Donghwan Lee; Do Wan Kim

arXiv:2204.10479 · cs.LG · February 2, 2026

Open Access

TL;DR

This paper develops a new finite-time error analysis for tabular TD learning by representing the algorithm as a discrete-time stochastic linear system and exploiting the Schur stability of the associated matrices, yielding error bounds and a reusable framework for finite-sample analysis.

Contribution

The paper introduces a finite-time error analysis framework for tabular TD learning that exploits a discrete-time stochastic linear system representation and the Schur stability of the associated matrices, offering control-theoretic insights.

Findings

01. Provides finite-time error bounds for TD learning.

02. Introduces a control-theoretic analysis framework.

03. Offers insights for future finite-sample RL research.

Abstract

Temporal difference (TD) learning is a cornerstone reinforcement learning (RL) method for policy evaluation, where the goal is to estimate the value function of a Markov decision process under a fixed policy. While a substantial body of work has established its convergence and stability properties, more recent efforts have focused on its statistical efficiency through finite-time error bounds. In this paper, we advance this line of research by developing a new finite-time error analysis for tabular TD learning that directly exploits a discrete-time stochastic linear system representation and leverages Schur stability of the associated matrices. Beyond the specific bounds obtained, the proposed framework provides a reusable template for analyzing TD learning and related RL algorithms, and it offers control-theoretic insights that may guide future developments in finite-sample RL theory.
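To make the linear-system view concrete, the following is a minimal sketch rather than the paper's own construction. It assumes states are sampled i.i.d. from a distribution d, a constant step size alpha, and a hypothetical 3-state Markov chain (P, r). Under those assumptions the mean of the tabular TD(0) error evolves as e_{k+1} = A e_k with A = I + alpha * D * (gamma * P - I), where D = diag(d); the paper's exact notation and assumptions may differ. The sketch checks that A is Schur stable (spectral radius below 1) and that its infinity norm obeys the contraction factor 1 - alpha * d_min * (1 - gamma).

```python
import numpy as np


def td_system_matrix(P, d, gamma, alpha):
    """Mean-dynamics matrix A = I + alpha * D (gamma * P - I) for tabular TD(0).

    Assumes i.i.d. state sampling from d and a constant step size alpha.
    """
    n = P.shape[0]
    D = np.diag(d)
    return np.eye(n) + alpha * D @ (gamma * P - np.eye(n))


def spectral_radius(A):
    """Largest eigenvalue magnitude; A is Schur stable when this is below 1."""
    return float(np.max(np.abs(np.linalg.eigvals(A))))


# Hypothetical 3-state chain, chosen only for illustration.
P = np.array([[0.5, 0.5, 0.0],
              [0.1, 0.6, 0.3],
              [0.2, 0.0, 0.8]])
r = np.array([1.0, 0.0, -1.0])   # expected reward per state (assumed)
d = np.array([0.3, 0.3, 0.4])    # assumed state-sampling distribution
gamma, alpha = 0.9, 0.5

A = td_system_matrix(P, d, gamma, alpha)
rho = spectral_radius(A)

# Infinity-norm contraction factor, valid when alpha * max(d) <= 1.
bound = 1.0 - alpha * d.min() * (1.0 - gamma)

# True value function from the Bellman equation V = r + gamma * P V.
V_star = np.linalg.solve(np.eye(3) - gamma * P, r)

# Propagate the mean error e_k = E[V_k] - V_star from V_0 = 0.
err = -V_star.copy()
err_norms = [float(np.linalg.norm(err, np.inf))]
for _ in range(500):
    err = A @ err
    err_norms.append(float(np.linalg.norm(err, np.inf)))

print("spectral radius:", rho)
print("infinity norm:", np.linalg.norm(A, np.inf), "bound:", bound)
print("mean error after 500 steps:", err_norms[-1])
```

Because the infinity norm of A is at most the contraction factor, the mean error in this sketch shrinks geometrically; the stochastic noise term that a full finite-time analysis must also control is deliberately omitted here.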

Taxonomy

Topics: Innovation Diffusion and Forecasting · Traffic control and management