MDP Playground: An Analysis and Debug Testbed for Reinforcement Learning

Raghu Rajan; Jessica Lizeth Borja Diaz; Suresh Guttikonda; Fabio Ferreira; André Biedenkapp; Jan Ole von Hartz; Frank Hutter

arXiv:1909.07750·cs.LG·July 18, 2023

TL;DR

MDP Playground is a versatile testbed for RL agents that allows controlled manipulation of environment difficulty across various dimensions, aiding in understanding, debugging, and improving reinforcement learning algorithms in both toy and complex settings.

Contribution

Introduces a parameterized RL testbed whose dimensions of hardness can be controlled independently, enabling systematic analysis and debugging of RL agents in both toy and complex environments.

Findings

01. Controlled environment dimensions reveal their impact on agent performance.

02. Wrappers effectively inject dimensions into existing environments.

03. Insights gained help improve RL agent robustness and understanding.

Abstract

We present MDP Playground, a testbed for Reinforcement Learning (RL) agents with dimensions of hardness that can be controlled independently to challenge agents in different ways and obtain varying degrees of hardness in toy and complex RL environments. We consider and allow control over a wide variety of dimensions, including delayed rewards, sequence lengths, reward density, stochasticity, image representations, irrelevant features, time unit, action range and more. We define a parameterised collection of fast-to-run toy environments in OpenAI Gym by varying these dimensions and propose to use these to understand agents better. We then show how to design experiments using MDP Playground to gain insights on the toy environments. We also provide wrappers that can inject many of these dimensions into any Gym environment. We experiment with these wrappers on Atari and MuJoCo to allow for…
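To make the idea of injecting a hardness dimension through a wrapper concrete, here is a minimal sketch of a reward-delay wrapper. It is not MDP Playground's own code or API: `ToyChainEnv`, `RewardDelayWrapper` and `rollout` are hypothetical names, the environment only imitates the classic Gym `reset()`/`step()` interface with a 4-tuple return so that the block runs without any dependency, and paying out all pending rewards when the episode ends is an assumption made here to keep the total return unchanged.

```python
from collections import deque


class ToyChainEnv:
    """Hypothetical toy chain: action 1 moves right (reward 1.0), action 0 stays."""

    def __init__(self, n_states=4):
        self.n_states = n_states
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        moved = action == 1 and self.state < self.n_states - 1
        if moved:
            self.state += 1
        reward = 1.0 if moved else 0.0
        done = self.state == self.n_states - 1  # episode ends at the last state
        return self.state, reward, done, {}


class RewardDelayWrapper:
    """Hypothetical wrapper: each reward is handed out `delay` steps late."""

    def __init__(self, env, delay):
        if delay < 0:
            raise ValueError("delay must be non-negative")
        self.env = env
        self.delay = delay
        self.pending = deque()  # rewards earned but not yet handed out

    def reset(self):
        self.pending.clear()
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self.pending.append(reward)
        if done:
            # Assumption: flush everything still pending at episode end.
            delayed = sum(self.pending)
            self.pending.clear()
        elif len(self.pending) > self.delay:
            delayed = self.pending.popleft()
        else:
            delayed = 0.0  # nothing is old enough to be paid out yet
        return obs, delayed, done, info


def rollout(env, actions):
    """Run a fixed action sequence and collect the rewards the agent sees."""
    env.reset()
    rewards = []
    for action in actions:
        _, reward, done, _ = env.step(action)
        rewards.append(reward)
        if done:
            break
    return rewards


actions = [1, 0, 1, 1]
plain_rewards = rollout(ToyChainEnv(), actions)
delayed_rewards = rollout(RewardDelayWrapper(ToyChainEnv(), delay=2), actions)
print(plain_rewards, delayed_rewards)
```

With the same action sequence the agent collects the same total reward in both runs, but in the wrapped environment it arrives later, which makes credit assignment harder while leaving the underlying task untouched. Varying `delay` alone while holding everything else fixed is the kind of single-dimension experiment the abstract describes.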

Code & Models

Repositories

automl/mdp-playground (Official)
