Distributed TD Tracking with Linear Function Approximation over Directed Communication Networks

Haocheng Yang; Shengchao Zhao; Yongchao Liu

arXiv:2605.04466·math.OC·May 7, 2026

TL;DR

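This paper introduces PP-DTD, a Push-Pull-type distributed policy evaluation algorithm for multi-agent reinforcement learning over directed networks. It combines TD learning under linear function approximation with variance reduction, converging linearly to a neighborhood of the optimum under constant step-sizes and at rate O(T^{-1}) under decaying step-sizes.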

Contribution

It presents the first distributed TD-based policy evaluation algorithm for directed graphs with proven linear convergence rates.

Findings

01

Under constant step-sizes, PP-DTD converges linearly to a neighborhood of the optimum.

02

Under decaying step-sizes, the algorithm converges at rate O(T^{-1}), for both i.i.d. and Markovian samples.

03

Numerical experiments show robustness and effectiveness in cooperative tasks.

Abstract

We study the policy evaluation problem in multi-agent reinforcement learning (MARL) over directed communication networks, where agents cooperate to explore an unknown environment and accomplish a specific task. We propose a Push-Pull-type distributed algorithm, named PP-DTD, for policy evaluation in MARL within the framework of temporal difference (TD) learning with linear function approximation. PP-DTD integrates TD learning with the Push-Pull mechanism to accommodate directed communication networks, and further uses variance reduction techniques to improve both algorithmic stability and convergence rate. We show that PP-DTD achieves linear convergence to a neighborhood of the optimum under constant step-sizes and a convergence rate of $O(T^{-1})$ under decaying step-sizes, whether the samples are independent and identically distributed or Markovian. To the…
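For readers unfamiliar with the building block named in the abstract, the following is a minimal sketch of single-agent TD(0) with linear function approximation, where the value estimate is V(s) = phi(s) · theta. It is not PP-DTD: it has no Push-Pull communication, no variance reduction, and only one agent. The function name `td0_linear`, the two-state toy chain, the feature vectors, and all parameter values are assumptions chosen purely for illustration.

```python
def td0_linear(features, rewards, next_state, gamma, alpha, num_steps, start=0):
    """Single-agent TD(0) with a linear value model V(s) = phi(s) . theta.

    Illustrative sketch only (not the paper's PP-DTD). Transitions are
    deterministic here so that the run is reproducible without a random seed.
    """
    theta = [0.0] * len(features[0])
    s = start
    for _ in range(num_steps):
        s_next = next_state[s]
        # Current and next-state value estimates under the linear model.
        v = sum(p * t for p, t in zip(features[s], theta))
        v_next = sum(p * t for p, t in zip(features[s_next], theta))
        # Temporal-difference error for the observed transition.
        delta = rewards[s] + gamma * v_next - v
        # Semi-gradient step along the feature vector of the current state.
        theta = [t + alpha * delta * p for t, p in zip(theta, features[s])]
        s = s_next
    return theta


# Hypothetical toy problem: two states that alternate 0 -> 1 -> 0 -> ...
FEATURES = [[1.0, 0.0], [1.0, 1.0]]   # phi(0), phi(1)
REWARDS = [1.0, 0.0]                  # reward received when leaving each state
NEXT_STATE = [1, 0]                   # deterministic transition map

theta = td0_linear(FEATURES, REWARDS, NEXT_STATE,
                   gamma=0.5, alpha=0.1, num_steps=2000)
values = [sum(p * t for p, t in zip(phi, theta)) for phi in FEATURES]
```

In this toy chain the true values solve V(0) = 1 + 0.5 V(1) and V(1) = 0.5 V(0), giving V(0) = 4/3 and V(1) = 2/3, and the estimates in `values` approach those numbers. In a distributed setting such as the one the abstract describes, each agent would only observe its own local reward, which is why a consensus or tracking mechanism over the network is needed on top of this update.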
