rlpyt: A Research Code Base for Deep Reinforcement Learning in PyTorch
Adam Stooke, Pieter Abbeel

TL;DR
rlpyt is a comprehensive, modular, and optimized PyTorch-based code base that implements all three major families of deep reinforcement learning algorithms, facilitating research and experimentation.
Contribution
It provides a unified, shared infrastructure implementing deep Q-learning, policy gradients, and Q-value policy gradients in a single repository, which, according to the abstract, few if any existing code bases do.
Findings
Supports all three major RL algorithm families
Optimized for high-throughput small- to medium-scale research
Modular design enables easy experimentation and extension
Abstract
Since the recent advent of deep reinforcement learning for game play and simulated robotic control, a multitude of new algorithms have flourished. Most are model-free algorithms which can be categorized into three families: deep Q-learning, policy gradients, and Q-value policy gradients. These have developed along separate lines of research, such that few, if any, code bases incorporate all three kinds. Yet these algorithms share a great depth of common deep reinforcement learning machinery. We are pleased to share rlpyt, which implements all three algorithm families on top of a shared, optimized infrastructure, in a single repository. It contains modular implementations of many common deep RL algorithms in Python using PyTorch, a leading deep learning library. rlpyt is designed as a high-throughput code base for small- to medium-scale research in deep RL. This white paper summarizes…
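The "common machinery" the abstract refers to can be illustrated with a small sketch. This is not rlpyt's API: every name below (ToyEnv, Transition, collect, discounted_returns, ReplayBuffer) is hypothetical, and plain Python stands in for PyTorch to keep the example self-contained. It only shows, under those assumptions, how a single environment-sampling loop could feed both an on-policy consumer (as used by policy gradient methods) and a replay buffer (as used by deep Q-learning and Q-value policy gradient methods).

```python
import random
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Transition:
    observation: int
    action: int
    reward: float
    next_observation: int
    done: bool


class ToyEnv:
    """Hypothetical 1-D chain: start at 0, reward 1.0 on reaching `goal`."""

    def __init__(self, goal: int = 3):
        self.goal = goal
        self.state = 0

    def reset(self) -> int:
        self.state = 0
        return self.state

    def step(self, action: int):
        # Action 1 moves right; any other action moves left (floored at 0).
        self.state = max(0, self.state + (1 if action == 1 else -1))
        done = self.state == self.goal
        return self.state, (1.0 if done else 0.0), done


def collect(env: ToyEnv, policy: Callable[[int], int], n_steps: int) -> List[Transition]:
    """Shared sampler: knows nothing about which algorithm consumes the data."""
    batch = []
    obs = env.reset()
    for _ in range(n_steps):
        action = policy(obs)
        next_obs, reward, done = env.step(action)
        batch.append(Transition(obs, action, reward, next_obs, done))
        obs = env.reset() if done else next_obs
    return batch


def discounted_returns(batch: List[Transition], gamma: float = 0.99) -> List[float]:
    """On-policy consumer: per-step discounted returns, reset at episode ends."""
    returns, running = [], 0.0
    for t in reversed(batch):
        running = t.reward + gamma * (0.0 if t.done else running)
        returns.append(running)
    return returns[::-1]


class ReplayBuffer:
    """Off-policy consumer: keeps the most recent `capacity` transitions."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.storage: List[Transition] = []

    def extend(self, batch: List[Transition]) -> None:
        self.storage.extend(batch)
        self.storage = self.storage[-self.capacity:]

    def sample(self, k: int, rng: random.Random) -> List[Transition]:
        return rng.sample(self.storage, min(k, len(self.storage)))


# The same collected batch serves both kinds of consumer.
batch = collect(ToyEnv(goal=3), policy=lambda obs: 1, n_steps=6)
returns = discounted_returns(batch, gamma=0.5)
buffer = ReplayBuffer(capacity=4)
buffer.extend(batch)
minibatch = buffer.sample(2, random.Random(0))
```

In this sketch the sampler is the shared piece, and only the code that consumes its output differs by algorithm family. How rlpyt itself divides these responsibilities is described in the white paper and repository, not here.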
Taxonomy
Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Parallel Computing and Optimization Techniques
