Leveraging Procedural Generation to Benchmark Reinforcement Learning

Karl Cobbe; Christopher Hesse; Jacob Hilton; John Schulman

arXiv:1912.01588 · cs.LG · July 28, 2020 · 171 cites

TL;DR

This paper presents Procgen Benchmark, a suite of 16 procedurally generated game-like environments for evaluating the sample efficiency and generalization of reinforcement learning agents. It shows that diverse environment distributions are essential for training and evaluation, and that larger models improve both measures.

Contribution

The paper introduces a new benchmark suite for RL, provides detailed experimental protocols for using it, and demonstrates empirically that procedural generation and larger models both improve RL performance.

Findings

01 Diverse environment distributions are crucial for effective RL training and evaluation.

02 Procedural content generation enhances the robustness of RL benchmarks.

03 Scaling model size improves both sample efficiency and generalization in RL agents.

Abstract

We introduce Procgen Benchmark, a suite of 16 procedurally generated game-like environments designed to benchmark both sample efficiency and generalization in reinforcement learning. We believe that the community will benefit from increased access to high quality training environments, and we provide detailed experimental protocols for using this benchmark. We empirically demonstrate that diverse environment distributions are essential to adequately train and evaluate RL agents, thereby motivating the extensive use of procedural content generation. We then use this benchmark to investigate the effects of scaling model size, finding that larger models significantly improve both sample efficiency and generalization.
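
As a rough illustration of how procedural generation supports a generalization measurement, the sketch below builds levels from integer seeds, keeps the training and held-out seeds disjoint, and reports the difference in mean return between the two sets. Everything here is an assumption made for illustration: `generate_level`, `split_level_seeds`, `generalization_gap`, the level format, and the level counts are hypothetical and are not the Procgen API or the paper's protocol.

```python
import random


def generate_level(seed, width=8):
    """Toy generator (hypothetical, not Procgen's): the seed alone fixes the layout."""
    rng = random.Random(seed)
    # '.' is open floor, '#' is a wall.
    return "".join(rng.choice(".#") for _ in range(width))


def split_level_seeds(num_train_levels, num_test_levels, start_seed=0):
    """Return disjoint seed lists for training and for held-out evaluation."""
    train = range(start_seed, start_seed + num_train_levels)
    test = range(train.stop, train.stop + num_test_levels)
    return list(train), list(test)


def generalization_gap(train_returns, test_returns):
    """Mean return on training levels minus mean return on held-out levels."""
    mean = lambda xs: sum(xs) / len(xs)
    return mean(train_returns) - mean(test_returns)


# Assumed sizes, chosen only to keep the example small.
train_seeds, test_seeds = split_level_seeds(num_train_levels=4, num_test_levels=2)
train_levels = [generate_level(s) for s in train_seeds]
test_levels = [generate_level(s) for s in test_seeds]
```

Because each level is a pure function of its seed, the same seed list always reproduces the same training set, and any seed outside that list yields a level the agent has not seen. A large positive gap would suggest the agent has overfit to its training levels.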

Code & Models

Videos

Leveraging Procedural Generation to Benchmark Reinforcement Learning · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics · Artificial Intelligence in Games · Sports Analytics and Performance