A Finite Sample Analysis of Distributional TD Learning with Linear Function Approximation
Yang Peng, Kaicheng Jin, Liangyu Zhang, Zhihua Zhang

TL;DR
This paper provides a finite-sample analysis of distributional TD learning with linear function approximation and shows that its sample complexity matches that of classic linear TD learning, so return distributions can be estimated as efficiently as value functions.
Contribution
The paper derives sharp finite-sample rates for distributional TD learning with linear function approximation, whereas previous statistical analyses mainly focus on the tabular case. The resulting sample complexity matches that of classic linear TD learning.
Findings
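The sample complexity of linear distributional TD learning matches that of classic linear TD learning.
With linear function approximation, learning the full return distribution from streaming data is no more difficult than learning its expectation (the value function).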
Provides theoretical insights into the statistical efficiency of distributional reinforcement learning.
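For reference, the baseline that the findings compare against, classic TD learning with linear function approximation, can be sketched as follows. This is a minimal illustration and not the paper's algorithm or code: the two-state Markov reward process, the one-hot features, the constant step size, and the iterate averaging are all assumptions chosen so that the true values are easy to check by hand (V = (1.5, 0.5) for the numbers below).

```python
import random


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def linear_td0(features, transitions, rewards, gamma, step_size, num_steps, seed=0):
    """Classic TD(0) with linear value estimates V(s) = <theta, phi(s)>.

    Returns the running average of the iterates (an illustrative choice,
    not something taken from the paper).
    """
    rng = random.Random(seed)
    num_states = len(features)
    dim = len(features[0])
    theta = [0.0] * dim
    theta_avg = [0.0] * dim
    state = 0
    for t in range(1, num_steps + 1):
        # Sample one transition from the streaming Markov chain.
        next_state = rng.choices(range(num_states), weights=transitions[state])[0]
        phi, phi_next = features[state], features[next_state]
        # TD error: r(s) + gamma * V(s') - V(s).
        td_error = rewards[state] + gamma * dot(theta, phi_next) - dot(theta, phi)
        # Semi-gradient update along the feature vector of the current state.
        theta = [th + step_size * td_error * p for th, p in zip(theta, phi)]
        # Running average of the iterates.
        theta_avg = [a + (th - a) / t for a, th in zip(theta_avg, theta)]
        state = next_state
    return theta_avg


# Hypothetical two-state example: reward 1 in state 0, reward 0 in state 1,
# uniform transitions, discount 0.5. Solving V = r + gamma * P V gives V = (1.5, 0.5).
FEATURES = [[1.0, 0.0], [0.0, 1.0]]        # one-hot features (tabular special case)
TRANSITIONS = [[0.5, 0.5], [0.5, 0.5]]
REWARDS = [1.0, 0.0]

theta_hat = linear_td0(FEATURES, TRANSITIONS, REWARDS, gamma=0.5,
                       step_size=0.05, num_steps=20000)
print(theta_hat)
```

The distributional variant analyzed in the paper estimates the whole return distribution rather than the scalar value; the sketch above only illustrates the scalar baseline whose sample complexity it is reported to match.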
Abstract
In this paper, we study the finite-sample statistical rates of distributional temporal difference (TD) learning with linear function approximation. The aim of distributional TD learning is to estimate the return distribution of a discounted Markov decision process for a given policy π. Previous works on statistical analysis of distributional TD learning mainly focus on the tabular case. In contrast, we first consider the linear function approximation setting and derive sharp finite-sample rates. Our theoretical results demonstrate that the sample complexity of linear distributional TD learning matches that of classic linear TD learning. This implies that, with linear function approximation, learning the full distribution of the return from streaming data is no more difficult than learning its expectation (value function). To derive tight sample complexity bounds, we conduct a…
