Settling the Communication Complexity for Distributed Offline Reinforcement Learning

Juliusz Krysztof Ziomek; Jun Wang; Yaodong Yang

arXiv:2202.04862·stat.ML·February 11, 2022·1 cites

TL;DR

This paper establishes fundamental limits on communication efficiency in distributed offline reinforcement learning, providing lower bounds and algorithms that achieve near-optimal risk under strict communication constraints.

Contribution

It introduces the first minimax lower bounds for distributed offline RL and proposes algorithms that nearly attain these bounds under single-round communication.

Findings

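01. For contextual bandits, the number of communicated bits must scale at least as Ω(AC) to match the centralised minimax optimal rate, where A is the number of actions and C is the context dimension.

02. Proposed algorithms based on least-squares and Monte-Carlo estimates achieve near-optimal risk.

03. Temporal difference methods are less effective under the single-round communication constraint.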
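
The single-round, bit-budgeted setting behind these findings can be illustrated with a small sketch. This is not the paper's algorithm. It assumes one-hot (discrete) contexts, so that a least-squares fit reduces to a Monte-Carlo mean reward per (context, action) cell, rewards in [0, 1], and a plain uniform quantiser. All function names (`quantize`, `local_estimate`, `encode_message`, `aggregate`, `run_demo`) are hypothetical. Each machine sends one message of C × A × `bits` bits, and the server averages the decoded messages.

```python
import random


def quantize(x, bits, lo=0.0, hi=1.0):
    """Map x to an integer code on a uniform grid of 2**bits levels over [lo, hi]."""
    levels = (1 << bits) - 1
    x = min(max(x, lo), hi)  # clip to the assumed value range
    return round((x - lo) / (hi - lo) * levels)


def dequantize(code, bits, lo=0.0, hi=1.0):
    """Invert quantize up to rounding error."""
    levels = (1 << bits) - 1
    return lo + code / levels * (hi - lo)


def local_estimate(data, n_contexts, n_actions):
    """Mean observed reward per (context, action) cell; 0.5 for unseen cells."""
    sums = [[0.0] * n_actions for _ in range(n_contexts)]
    counts = [[0] * n_actions for _ in range(n_contexts)]
    for c, a, r in data:
        sums[c][a] += r
        counts[c][a] += 1
    return [
        [sums[c][a] / counts[c][a] if counts[c][a] else 0.5 for a in range(n_actions)]
        for c in range(n_contexts)
    ]


def encode_message(estimate, bits):
    """The single message a machine sends: one `bits`-bit code per cell."""
    return [[quantize(v, bits) for v in row] for row in estimate]


def aggregate(messages, bits):
    """Server side: decode every message and average cell by cell."""
    m = len(messages)
    n_contexts, n_actions = len(messages[0]), len(messages[0][0])
    return [
        [sum(dequantize(msg[c][a], bits) for msg in messages) / m for a in range(n_actions)]
        for c in range(n_contexts)
    ]


def run_demo(n_machines=8, n_samples=2000, n_contexts=3, n_actions=2, bits=8, seed=0):
    """Simulate offline data on each machine, then one round of communication."""
    rng = random.Random(seed)
    true_values = [[rng.random() for _ in range(n_actions)] for _ in range(n_contexts)]
    messages = []
    for _ in range(n_machines):
        data = []
        for _ in range(n_samples):
            c, a = rng.randrange(n_contexts), rng.randrange(n_actions)
            r = 1.0 if rng.random() < true_values[c][a] else 0.0  # Bernoulli reward
            data.append((c, a, r))
        messages.append(encode_message(local_estimate(data, n_contexts, n_actions), bits))
    bits_per_machine = n_contexts * n_actions * bits
    return aggregate(messages, bits), true_values, bits_per_machine
```

Calling `run_demo()` returns the aggregated estimate, the simulated true values, and the per-machine message size in bits. Lowering `bits` shows how a tighter budget coarsens each machine's message.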

Abstract

We study a novel setting in offline reinforcement learning (RL) where a number of distributed machines cooperate to solve the problem, but only a single round of communication is allowed and there is a budget constraint on the total amount of information (in bits) that each machine can send out. For value function prediction in contextual bandits, and in both episodic and non-episodic MDPs, we establish information-theoretic lower bounds on the minimax risk for distributed statistical estimators; this reveals the minimum amount of communication required by any offline RL algorithm. Specifically, for contextual bandits, we show that the number of bits must scale at least as $\Omega(AC)$ to match the centralised minimax optimal rate, where $A$ is the number of actions and $C$ is the context dimension; we reach similar results in the MDP settings. Furthermore, we…

Taxonomy

Topics: Reinforcement Learning in Robotics · Mobile Crowdsensing and Crowdsourcing · Evolutionary Algorithms and Applications