Open RL Benchmark: Comprehensive Tracked Experiments for Reinforcement Learning

Shengyi Huang; Quentin Gallouédec; Florian Felten; Antonin Raffin; Rousslan Fernand Julien Dossa; Yanxiao Zhao; Ryan Sullivan; Viktor Makoviychuk; Denys Makoviichuk; Mohamad H. Danesh; Cyril Roumégous; Jiayi Weng; Chufan Chen; Md Masudur Rahman; João G. M. Araújo; Guorui Quan; Daniel Tan; Timo Klein; Rujikorn Charakorn; Mark Towers; Yann Berthelot; Kinal Mehta; Dipam Chakraborty; Arjun KG; Valentin Charraut; Chang Ye; Zichen Liu; Lucas N. Alegre; Alexander Nikulin; Xiao Hu; Tianlin Liu; Jongwook Choi; Brent Yi

arXiv:2402.03046 · cs.LG · February 6, 2024 · 2 cites

TL;DR

Open RL Benchmark provides a comprehensive, community-driven collection of fully tracked reinforcement learning experiments, enabling reproducibility, detailed analysis, and easier comparison of RL algorithms.

Contribution

It introduces the first extensive, fully tracked RL benchmark dataset with reproducibility features and a CLI tool for analysis, covering over 25,000 runs from multiple libraries.

Findings

01 Over 25,000 tracked RL runs with detailed metrics.

02 Provides reproducibility through full parameters and dependency versions.

03 Includes case studies demonstrating practical use.

Abstract

In many Reinforcement Learning (RL) papers, learning curves are useful indicators to measure the effectiveness of RL algorithms. However, the complete raw data of the learning curves are rarely available. As a result, it is usually necessary to reproduce the experiments from scratch, which can be time-consuming and error-prone. We present Open RL Benchmark, a set of fully tracked RL experiments, including not only the usual data such as episodic return, but also all algorithm-specific and system metrics. Open RL Benchmark is community-driven: anyone can download, use, and contribute to the data. At the time of writing, more than 25,000 runs have been tracked, for a cumulative duration of more than 8 years. Open RL Benchmark covers a wide range of RL libraries and reference implementations. Special care is taken to ensure that each experiment is precisely reproducible by providing not…
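As a rough illustration of why raw learning-curve data matters, the sketch below aggregates episodic return across several runs (seeds) into a per-step mean and standard deviation, which is the usual basis for a learning-curve plot. It is a minimal example under stated assumptions: the numbers are synthetic, the helper name `aggregate_curves` is hypothetical, and nothing here uses the Open RL Benchmark tooling or its data format.

```python
from statistics import mean, stdev


def aggregate_curves(runs):
    """Return per-step (mean, sample std) of episodic return across runs.

    `runs` is a list of equal-length lists, one list per seed.
    This helper is hypothetical and only illustrates the aggregation step.
    """
    if len(runs) < 2:
        raise ValueError("need at least two runs to estimate spread")
    if len({len(r) for r in runs}) != 1:
        raise ValueError("all runs must be logged at the same steps")
    # zip(*runs) groups the values of all seeds at each logging step.
    return [(mean(step), stdev(step)) for step in zip(*runs)]


# Synthetic episodic returns for three seeds at four logging steps
# (made-up numbers, not taken from the benchmark).
runs = [
    [10.0, 20.0, 30.0, 40.0],
    [12.0, 22.0, 28.0, 44.0],
    [14.0, 18.0, 32.0, 36.0],
]

curve = aggregate_curves(runs)
for step, (m, s) in enumerate(curve):
    print(f"step {step}: mean={m:.1f} std={s:.1f}")
```

With only a published figure, this aggregation cannot be recomputed or extended with new baselines; with the per-run data it takes a few lines.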

Code & Models

Repositories

openrlbenchmark/openrlbenchmark
jax

Taxonomy

Topics: Evolutionary Algorithms and Applications · Open Source Software Innovations · Software Engineering Research

Methods: Sparse Evolutionary Training