TL;DR
MDP Playground is a versatile testbed for RL agents that allows controlled manipulation of environment difficulty across various dimensions, aiding in understanding, debugging, and improving reinforcement learning algorithms in both toy and complex settings.
Contribution
Introduces a flexible, parameterized RL testbed whose dimensions of hardness can be controlled independently, enabling systematic analysis and debugging of RL agents in both toy and complex environments.
Findings
Varying one dimension of hardness at a time exposes its effect on agent performance.
Wrappers inject many of these dimensions into existing Gym environments, including Atari and Mujoco.
Insights from the toy environments help in understanding RL agents and improving their robustness.
Abstract
We present MDP Playground, a testbed for Reinforcement Learning (RL) agents with dimensions of hardness that can be controlled independently to challenge agents in different ways and obtain varying degrees of hardness in toy and complex RL environments. We consider and allow control over a wide variety of dimensions, including delayed rewards, sequence lengths, reward density, stochasticity, image representations, irrelevant features, time unit, action range and more. We define a parameterised collection of fast-to-run toy environments in OpenAI Gym by varying these dimensions and propose to use these to understand agents better. We then show how to design experiments using MDP Playground to gain insights on the toy environments. We also provide wrappers that can inject many of these dimensions into any Gym environment. We experiment with these wrappers on Atari and Mujoco to allow for…
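To make the idea of a wrapper that injects a dimension of hardness concrete, here is a minimal sketch of a reward-delay wrapper. It is not MDP Playground's actual API: `ChainEnv`, `DelayedRewardWrapper`, and `rollout` are hypothetical names, the environment only imitates the classic Gym `reset`/`step` interface in plain Python, and flushing outstanding rewards at episode end is an assumption made here so that the total return stays unchanged.

```python
from collections import deque


class ChainEnv:
    """Hypothetical toy environment with a Gym-like reset/step interface."""

    def __init__(self, length=5):
        self.length = length
        self.pos = 0

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):
        # Action 1 moves one state to the right and earns reward 1.0;
        # any other action stays in place and earns nothing.
        reward = 0.0
        if action == 1:
            self.pos += 1
            reward = 1.0
        done = self.pos >= self.length
        return self.pos, reward, done, {}


class DelayedRewardWrapper:
    """Hypothetical wrapper: hands out each reward `delay` steps late."""

    def __init__(self, env, delay):
        self.env = env
        self.delay = delay
        self.pending = deque()

    def reset(self):
        self.pending.clear()
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self.pending.append(reward)
        if done:
            # Assumption: pay out everything still queued when the episode
            # ends, so the undiscounted return matches the unwrapped env.
            delayed = sum(self.pending)
            self.pending.clear()
        elif len(self.pending) > self.delay:
            delayed = self.pending.popleft()
        else:
            delayed = 0.0
        return obs, delayed, done, info


def rollout(env, action=1):
    """Run one episode with a fixed action and collect the rewards seen."""
    env.reset()
    rewards, done = [], False
    while not done:
        _, reward, done, _ = env.step(action)
        rewards.append(reward)
    return rewards


if __name__ == "__main__":
    print(rollout(ChainEnv(length=5)))
    print(rollout(DelayedRewardWrapper(ChainEnv(length=5), delay=2)))
```

Because the wrapper only touches the reward signal, the delay can be changed while every other property of the environment stays fixed, which is the kind of independent control the abstract describes.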
