MSRL: Distributed Reinforcement Learning with Dataflow Fragments

Huanzhou Zhu; Bo Zhao; Gang Chen; Weifeng Chen; Yijie Chen; Liang Shi; Yaodong Yang; Peter Pietzuch; Lei Chen

arXiv:2210.00882·cs.LG·October 31, 2022

TL;DR

MSRL introduces a flexible distributed reinforcement learning system that decouples algorithms from execution strategies using dataflow fragments, enabling scalable training across large GPU clusters without modifying algorithms.

Contribution

The paper presents MSRL, a system that uses dataflow fragments as an abstraction for distributing RL training across clusters, replacing the hard-coded distribution strategies of existing systems.

Findings

01. Supports distribution policies without algorithm changes
02. Scales RL training to 64 GPUs
03. Subsumes existing distribution strategies

Abstract

Reinforcement learning (RL) trains many agents, which is resource-intensive and must scale to large GPU clusters. Different RL training algorithms offer different opportunities for distributing and parallelising the computation. Yet, current distributed RL systems tie the definition of RL algorithms to their distributed execution: they hard-code particular distribution strategies and only accelerate specific parts of the computation (e.g. policy network updates) on GPU workers. Fundamentally, current systems lack abstractions that decouple RL algorithms from their execution. We describe MindSpore Reinforcement Learning (MSRL), a distributed RL training system that supports distribution policies that govern how RL training computation is parallelised and distributed on cluster resources, without requiring changes to the algorithm implementation. MSRL introduces the new abstraction of a…
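The decoupling described in the abstract can be illustrated with a toy sketch. This is not the MSRL API and does not implement dataflow fragments; every name here (`act`, `learn`, `ALGORITHM`, `SINGLE_DEVICE`, `ACTOR_LEARNER`, `run`) is a hypothetical stand-in, and the "devices" are plain labels. It only shows the general idea that the algorithm is written once, while a separate policy decides where each part would run.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, float]
Step = Callable[[State], State]


def act(state: State) -> State:
    # Stand-in for an actor collecting experience: the reward depends on the weight.
    return {**state, "reward": 1.0 - state["weight"]}


def learn(state: State) -> State:
    # Stand-in for a learner update: move the weight towards the reward signal.
    return {**state, "weight": state["weight"] + 0.5 * state["reward"]}


# The "algorithm": an ordered list of named steps with no placement information.
ALGORITHM: List[Tuple[str, Step]] = [("act", act), ("learn", learn)]

# Two hypothetical distribution policies, each mapping a step name to a device label.
SINGLE_DEVICE: Dict[str, str] = {"act": "gpu:0", "learn": "gpu:0"}
ACTOR_LEARNER: Dict[str, str] = {"act": "cpu:0", "learn": "gpu:0"}


def run(
    algorithm: List[Tuple[str, Step]],
    policy: Dict[str, str],
    state: State,
    iterations: int,
) -> Tuple[State, List[Tuple[str, str]]]:
    """Run the algorithm under a placement policy and record where each step ran."""
    trace: List[Tuple[str, str]] = []
    for _ in range(iterations):
        for name, step in algorithm:
            # A real system would dispatch to the device; this sketch only logs it.
            trace.append((name, policy[name]))
            state = step(state)
    return state, trace


if __name__ == "__main__":
    start = {"weight": 0.0, "reward": 0.0}
    for policy in (SINGLE_DEVICE, ACTOR_LEARNER):
        final, trace = run(ALGORITHM, policy, dict(start), iterations=3)
        print(final["weight"], trace[:2])
```

Swapping `SINGLE_DEVICE` for `ACTOR_LEARNER` changes only the recorded placement, not `act`, `learn`, or the final state, which is the property the abstract attributes to distribution policies.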

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Ferroelectric and Negative Capacitance Devices