Scaling Distributed Multi-task Reinforcement Learning with Experience Sharing

Sanae Amani; Khushbu Pahwa; Vladimir Braverman; Lin F. Yang

arXiv:2307.05834 · cs.LG · July 13, 2023

TL;DR

This paper introduces DistMT-LSVI, a distributed reinforcement learning algorithm that enables multiple agents to collaboratively learn multiple tasks efficiently through experience sharing, significantly reducing sample complexity compared to non-distributed methods.

Contribution

The paper proposes a novel distributed RL algorithm, DistMT-LSVI, with theoretical guarantees and empirical validation, advancing multi-task learning efficiency in multi-agent systems.

Findings

01. DistMT-LSVI achieves near-optimal sample complexity, improving on non-distributed methods by a factor of 1/N, where N is the number of agents.

02. The theoretical analysis bounds the number of episodes needed to obtain $ϵ$-optimal policies for the tasks.

03. Empirical results on Atari environments support the theoretical claims.

Abstract

Recently, DARPA launched the ShELL program, which aims to explore how experience sharing can benefit distributed lifelong learning agents in adapting to new challenges. In this paper, we address this issue by conducting both theoretical and empirical research on distributed multi-task reinforcement learning (RL), where a group of $N$ agents collaboratively solves $M$ tasks without prior knowledge of their identities. We approach the problem by formulating it as linearly parameterized contextual Markov decision processes (MDPs), where each task is represented by a context that specifies the transition dynamics and rewards. To tackle this problem, we propose an algorithm called DistMT-LSVI. First, the agents identify the tasks, and then they exchange information through a central server to derive $ϵ$-optimal policies for the tasks. Our research demonstrates that to achieve…
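The server exchange described in the abstract can be pictured with a small sketch. This is not the paper's DistMT-LSVI: it omits task identification entirely, uses synthetic regression targets in place of value targets, and assumes all agents are working on the same task. The function names (`local_statistics`, `server_aggregate`) and all parameters are hypothetical. It only illustrates why sharing helps in a linear setting: agents can send least-squares summary statistics to a server, and the pooled estimate matches what a single learner would compute from all the data.

```python
import numpy as np


def local_statistics(features, targets):
    """Summarise one agent's data as (Gram matrix, feature-target vector)."""
    gram = features.T @ features
    moment = features.T @ targets
    return gram, moment


def server_aggregate(stats, ridge=1.0):
    """Sum the agents' statistics and solve the pooled ridge regression."""
    dim = stats[0][0].shape[0]
    gram = ridge * np.eye(dim) + sum(g for g, _ in stats)
    moment = sum(m for _, m in stats)
    return np.linalg.solve(gram, moment)


# Synthetic data (an assumption for illustration, not from the paper).
rng = np.random.default_rng(0)
num_agents, samples_per_agent, dim = 4, 50, 3
true_weights = rng.normal(size=dim)

agent_data = []
for _ in range(num_agents):
    features = rng.normal(size=(samples_per_agent, dim))
    noise = 0.1 * rng.normal(size=samples_per_agent)
    agent_data.append((features, features @ true_weights + noise))

# Each agent sends only its summary statistics to the server.
shared_weights = server_aggregate(
    [local_statistics(f, t) for f, t in agent_data]
)

# Reference: ridge regression computed centrally on the stacked raw data.
all_features = np.vstack([f for f, _ in agent_data])
all_targets = np.concatenate([t for _, t in agent_data])
central_weights = np.linalg.solve(
    np.eye(dim) + all_features.T @ all_features,
    all_features.T @ all_targets,
)
```

Here `shared_weights` and `central_weights` agree up to floating-point error, so each agent benefits from all N agents' samples while transmitting only a d-by-d matrix and a d-vector.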

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Smart Grid Energy Management