Federated Temporal Difference Learning with Linear Function Approximation under Environmental Heterogeneity

Han Wang; Aritra Mitra; Hamed Hassani; George J. Pappas; James Anderson

arXiv:2302.02212 · cs.LG · July 2, 2024 · 6 cites

TL;DR

This paper provides the first finite-time analysis of federated TD learning with linear function approximation, demonstrating linear speedups in policy evaluation under low environmental heterogeneity.

Contribution

The paper gives a comprehensive finite-time analysis of federated TD learning that accounts for environmental heterogeneity, Markovian sampling, and multiple local updates to save communication. The analysis relies on novel perturbation bounds on TD fixed points and a virtual MDP framework.

Findings

01. Exchanging information accelerates policy evaluation in low-heterogeneity settings.

02. The analysis establishes conditions for linear convergence speedups with multiple agents.

03. Novel perturbation bounds relate heterogeneity to TD fixed points.

Abstract

We initiate the study of federated reinforcement learning under environmental heterogeneity by considering a policy evaluation problem. Our setup involves $N$ agents interacting with environments that share the same state and action space but differ in their reward functions and state transition kernels. Assuming agents can communicate via a central server, we ask: Does exchanging information expedite the process of evaluating a common policy? To answer this question, we provide the first comprehensive finite-time analysis of a federated temporal difference (TD) learning algorithm with linear function approximation, while accounting for Markovian sampling, heterogeneity in the agents' environments, and multiple local updates to save communication. Our analysis crucially relies on several novel ingredients: (i) deriving perturbation bounds on TD fixed points as a function of the…
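The setup in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes plain TD(0) with a constant step size, simple parameter averaging at the server, and a toy two-state problem. The names `td_local_steps` and `federated_td`, the transition matrices, the rewards, and all hyperparameters are hypothetical and chosen only for illustration.

```python
import random


def td_local_steps(theta, state, P, r, Phi, gamma, alpha, K, rng):
    """Run K TD(0) updates along one agent's own Markov chain (Markovian sampling)."""
    theta = list(theta)
    for _ in range(K):
        # Sample the next state from this agent's transition kernel under the fixed policy.
        next_state = rng.choices(range(len(P)), weights=P[state])[0]
        # Linear value estimates: V(s) = phi(s)^T theta.
        v = sum(p * t for p, t in zip(Phi[state], theta))
        v_next = sum(p * t for p, t in zip(Phi[next_state], theta))
        # TD error and semi-gradient update.
        delta = r[state] + gamma * v_next - v
        theta = [t + alpha * delta * p for t, p in zip(theta, Phi[state])]
        state = next_state
    return theta, state


def federated_td(envs, Phi, gamma=0.9, alpha=0.05, K=5, rounds=200, seed=0):
    """Each round: every agent takes K local TD steps, then the server averages parameters."""
    rng = random.Random(seed)
    theta = [0.0] * len(Phi[0])
    states = [0] * len(envs)  # each agent continues its own trajectory across rounds
    for _ in range(rounds):
        local = []
        for i, (P, r) in enumerate(envs):
            th, states[i] = td_local_steps(
                theta, states[i], P, r, Phi, gamma, alpha, K, rng
            )
            local.append(th)
        # Server step: average the agents' local parameters (assumed aggregation rule).
        theta = [sum(col) / len(local) for col in zip(*local)]
    return theta


# Two agents whose environments share the state space but differ slightly
# in transition kernel and reward (low heterogeneity). Values are made up.
envs = [
    ([[0.90, 0.10], [0.20, 0.80]], [1.0, 0.0]),
    ([[0.85, 0.15], [0.25, 0.75]], [1.1, 0.0]),
]
Phi = [[1.0, 0.0], [0.0, 1.0]]  # one-hot features, a special case of linear approximation
theta = federated_td(envs, Phi)
print(theta)
```

Here `K` controls how many local updates happen between communication rounds, which is the communication-saving mechanism the abstract mentions. Because the two environments differ, the averaged parameter need not match either agent's own TD fixed point; relating that gap to the level of heterogeneity is what the perturbation bounds in the paper address.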

Taxonomy

Topics: Reinforcement Learning in Robotics