Convergence of TD(0) under Polynomial Mixing with Nonlinear Function Approximation

Anupama Sridhar; Alexander Johansen

arXiv:2502.05706 · stat.ML · May 22, 2025

Open Access

TL;DR

This paper provides the first finite-sample, high-probability analysis of vanilla TD(0) with nonlinear function approximation under polynomial mixing, showing convergence rates comparable to i.i.d. data without requiring projections or subsampling.

Contribution

It introduces a novel analysis technique for nonlinear TD(0) under polynomial mixing, removing previous restrictions like subsampling and projections.

Findings

01 Convergence rates match those known for i.i.d. data.

02 High-probability bounds hold for nonstationary initialization.

03 A novel coupling method bypasses the need for geometric ergodicity.

Abstract

Temporal Difference Learning (TD(0)) is fundamental in reinforcement learning, yet its finite-sample behavior under non-i.i.d. data and nonlinear approximation remains unknown. We provide the first high-probability, finite-sample analysis of vanilla TD(0) on polynomially mixing Markov data, assuming only Hölder continuity and bounded generalized gradients. This breaks with previous work, which often requires subsampling, projections, or instance-dependent step-sizes. Concretely, for mixing exponent $\beta > 1$, Hölder continuity exponent $\gamma$, and step-size decay rate $\eta \in (1/2, 1]$, we show that, with high probability, \[ \| \theta_t - \theta^* \| \leq C(\beta, \gamma, \eta)\, t^{-\beta/2} + C'(\gamma, \eta)\, t^{-\eta\gamma} \] after $t = O(1/\varepsilon^{2})$ iterations. These bounds match the known i.i.d. rates and hold even when initialization is nonstationary.…
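To make "vanilla TD(0)" concrete, here is a minimal sketch of the semi-gradient update with a nonlinear value function, with no projection step and no subsampling of the trajectory. Everything specific in it is an assumption for illustration rather than the paper's setup: the `tanh` approximator, the two-state chain (which mixes geometrically, not polynomially), the step-size schedule `alpha0 * t ** (-eta)` as one reading of "step-size decay rate", and all names such as `td0`, `features`, and `discount`. The discount factor is called `discount` to keep it separate from the Hölder exponent $\gamma$ in the abstract.

```python
import math
import random


def value(theta, phi):
    """Nonlinear value estimate V_theta(s) = tanh(<theta, phi(s)>) (illustrative choice)."""
    return math.tanh(sum(w * x for w, x in zip(theta, phi)))


def grad_value(theta, phi):
    """Gradient of tanh(<theta, phi>) with respect to theta."""
    v = value(theta, phi)
    return [(1.0 - v * v) * x for x in phi]


def td0(features, transition, reward, discount=0.9, eta=0.75,
        alpha0=0.5, steps=5000, seed=0):
    """Run semi-gradient TD(0) along a single Markov trajectory.

    Assumed schedule: alpha_t = alpha0 * t**(-eta) with eta in (1/2, 1].
    Every transition is used, and theta is never projected onto a bounded set.
    """
    rng = random.Random(seed)
    n_states = len(features)
    theta = [0.0] * len(features[0])
    s = 0  # fixed start state, so the chain is not started from its stationary law
    for t in range(1, steps + 1):
        s_next = rng.choices(range(n_states), weights=transition[s])[0]
        # TD error: r(s) + discount * V(s') - V(s)
        delta = (reward[s]
                 + discount * value(theta, features[s_next])
                 - value(theta, features[s]))
        alpha = alpha0 * t ** (-eta)
        g = grad_value(theta, features[s])
        theta = [w + alpha * delta * gi for w, gi in zip(theta, g)]
        s = s_next
    return theta


# Hypothetical two-state example with one-hot features.
features = [[1.0, 0.0], [0.0, 1.0]]
transition = [[0.9, 0.1], [0.2, 0.8]]
reward = [0.0, 0.1]
theta = td0(features, transition, reward)
print([round(value(theta, phi), 3) for phi in features])
```

Changing `eta` within $(1/2, 1]$ changes how quickly the step sizes shrink, which is the quantity that enters the $t^{-\eta\gamma}$ term of the bound; the sketch does not attempt to reproduce the constants or the mixing regime analyzed in the paper.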

Taxonomy

Topics: Neural Networks and Applications · Machine Learning and Algorithms · Control Systems and Identification