rlpyt: A Research Code Base for Deep Reinforcement Learning in PyTorch

Adam Stooke; Pieter Abbeel

arXiv:1909.01500 · cs.LG · September 25, 2019 · 52 cites

TL;DR

rlpyt is a modular, optimized PyTorch code base that implements all three major families of model-free deep reinforcement learning algorithms (deep Q-learning, policy gradients, and Q-value policy gradients) on shared infrastructure, aimed at small- to medium-scale research.

Contribution

rlpyt provides a unified, shared infrastructure implementing deep Q-learning, policy gradients, and Q-value policy gradients in a single repository. The authors note that few, if any, existing code bases incorporate all three.

Findings

01 Supports all three major RL algorithm families

02 Optimized for high-throughput small- to medium-scale research

03 Modular design enables easy experimentation and extension

Abstract

Since the recent advent of deep reinforcement learning for game play and simulated robotic control, a multitude of new algorithms have flourished. Most are model-free algorithms which can be categorized into three families: deep Q-learning, policy gradients, and Q-value policy gradients. These have developed along separate lines of research, such that few, if any, code bases incorporate all three kinds. Yet these algorithms share a great depth of common deep reinforcement learning machinery. We are pleased to share rlpyt, which implements all three algorithm families on top of a shared, optimized infrastructure, in a single repository. It contains modular implementations of many common deep RL algorithms in Python using PyTorch, a leading deep learning library. rlpyt is designed as a high-throughput code base for small- to medium-scale research in deep RL. This white paper summarizes…
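
To make the "shared machinery" point concrete, here is a minimal sketch of how one sampler can feed all three algorithm families. It is not rlpyt's API: every name below (`ToyEnv`, `Transition`, `collect`, and the three target functions) is hypothetical, the environment is a toy chain, and plain Python stands in for PyTorch so the example stays self-contained.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Transition:
    observation: int
    action: int
    reward: float
    next_observation: int
    done: bool


class ToyEnv:
    """Hypothetical chain: action 1 moves right, action 0 stays.
    Reaching position `length` ends the episode with reward 1."""

    def __init__(self, length: int = 3) -> None:
        self.length = length
        self.state = 0

    def reset(self) -> int:
        self.state = 0
        return self.state

    def step(self, action: int):
        self.state += action
        done = self.state >= self.length
        return self.state, (1.0 if done else 0.0), done


def collect(env: ToyEnv, policy: Callable[[int], int], max_steps: int) -> List[Transition]:
    """Shared sampler: its output format is the same for every family."""
    batch = []
    obs = env.reset()
    for _ in range(max_steps):
        action = policy(obs)
        next_obs, reward, done = env.step(action)
        batch.append(Transition(obs, action, reward, next_obs, done))
        obs = env.reset() if done else next_obs
    return batch


def discounted_returns(batch: List[Transition], gamma: float) -> List[float]:
    """Policy-gradient style target: Monte Carlo return at each step.
    A trajectory cut off before `done` is treated as ending with value 0."""
    returns = [0.0] * len(batch)
    running = 0.0
    for i in reversed(range(len(batch))):
        if batch[i].done:
            running = 0.0  # do not leak reward across episode boundaries
        running = batch[i].reward + gamma * running
        returns[i] = running
    return returns


def q_learning_targets(batch: List[Transition], q: Dict[int, List[float]], gamma: float) -> List[float]:
    """Deep-Q style target: r + gamma * max_a Q(s', a)."""
    return [
        t.reward + gamma * (0.0 if t.done else max(q[t.next_observation]))
        for t in batch
    ]


def qpg_targets(batch: List[Transition], q: Dict[int, List[float]],
                actor: Callable[[int], int], gamma: float) -> List[float]:
    """Q-value policy gradient style target: r + gamma * Q(s', actor(s'))."""
    return [
        t.reward + gamma * (0.0 if t.done else q[t.next_observation][actor(t.next_observation)])
        for t in batch
    ]
```

All three target functions consume the same `List[Transition]`, which is the design idea the abstract describes: sampling and data handling are written once, and each family only adds its own learning rule on top.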

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Parallel Computing and Optimization Techniques