ARLBench: Flexible and Efficient Benchmarking for Hyperparameter Optimization in Reinforcement Learning

Jannis Becktepe; Julian Dierkes; Carolin Benjamins; Aditya Mohan; David Salinas; Raghu Rajan; Frank Hutter; Holger Hoos; Marius Lindauer; Theresa Eimer

arXiv:2409.18827 · cs.LG · March 11, 2026

Open Access · 1 Repo · 1 Dataset

TL;DR

ARLBench is a benchmark for hyperparameter optimization (HPO) in reinforcement learning (RL) that makes comparing HPO methods efficient and flexible, so that this research remains feasible with limited computational resources.

Contribution

It introduces a scalable benchmark and dataset for HPO in RL, allowing diverse approaches to be evaluated efficiently across multiple domains.

Findings

01 Enables comparison of HPO methods with reduced compute

02 Provides a representative set of RL tasks for benchmarking

03 Supports research in AutoRL with accessible resources

Abstract

Hyperparameters are a critical factor in reliably training well-performing reinforcement learning (RL) agents. Unfortunately, developing and evaluating automated approaches for tuning such hyperparameters is both costly and time-consuming. As a result, such approaches are often only evaluated on a single domain or algorithm, making comparisons difficult and limiting insights into their generalizability. We propose ARLBench, a benchmark for hyperparameter optimization (HPO) in RL that allows comparisons of diverse HPO approaches while being highly efficient in evaluation. To enable research into HPO in RL, even in settings with low compute resources, we select a representative subset of HPO tasks spanning a variety of algorithm and environment combinations. This selection allows for generating a performance profile of an automated RL (AutoRL) method using only a fraction of the compute…
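The abstract does not say how the representative subset is chosen. As an illustration only, one plausible reading is to pick the few tasks whose scores best predict the average score over all tasks. The sketch below assumes a score matrix with one row per hyperparameter configuration and one column per task. The function name `select_subset`, the matrix layout, the linear fit, and the exhaustive search are assumptions made for this example; they are not taken from the paper or from the ARLBench code.

```python
import itertools

import numpy as np


def select_subset(scores, k):
    """Return the k task columns whose linear fit best predicts the all-task mean.

    scores: array of shape (n_configs, n_tasks), hypothetical normalized returns.
    The fit is in-sample (no cross-validation), which keeps the sketch short.
    """
    scores = np.asarray(scores, dtype=float)
    n_configs, n_tasks = scores.shape
    if not 1 <= k <= n_tasks:
        raise ValueError("k must be between 1 and the number of tasks")

    target = scores.mean(axis=1)  # average performance over every task
    best_subset, best_err = None, np.inf

    # Exhaustive search over all size-k subsets; only viable for few tasks.
    for subset in itertools.combinations(range(n_tasks), k):
        # Design matrix: the chosen task columns plus an intercept column.
        X = np.column_stack([scores[:, list(subset)], np.ones(n_configs)])
        coef, *_ = np.linalg.lstsq(X, target, rcond=None)
        err = float(np.mean((X @ coef - target) ** 2))
        if err < best_err:
            best_subset, best_err = subset, err

    return best_subset, best_err


# Synthetic scores for demonstration; these are not benchmark results.
rng = np.random.default_rng(0)
demo_scores = rng.random((40, 6))
subset, err = select_subset(demo_scores, k=2)
print(f"chosen task columns: {subset}, in-sample MSE: {err:.4f}")
```

An HPO method would then be run only on the selected columns, with the fitted coefficients giving an estimate of its full-set average. Because the number of subsets grows combinatorially with the number of tasks, a greedy search would be the natural replacement for larger task sets.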

Code & Models

Repositories

automl/arlbench
jax · Official

Datasets

autorl-org/arlbench
dataset · 1.0k dl

Taxonomy

Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Machine Learning and Data Classification

Methods: Hyper-parameter optimization