Learning to Locomote: Understanding How Environment Design Matters for Deep Reinforcement Learning
Daniele Reda, Tianxin Tao, Michiel van de Panne

TL;DR
This paper investigates how environment design choices, including state representation, initial state distribution, reward structure, control frequency, episode termination, curriculum use, action space, and torque limits, affect the success and robustness of deep reinforcement learning policies for locomotion.
Contribution
It examines how environment design factors affect RL outcomes and documents how they contribute to the brittleness of learned locomotion policies.
Findings
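Environment design choices strongly affect whether RL succeeds on continuous-action control tasks.
Environment design can contribute to the brittleness of RL results, so some configurations yield more robust policies than others.
These choices matter in practice, yet they have received far less attention than RL algorithms.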
Abstract
Learning to locomote is one of the most common tasks in physics-based animation and deep reinforcement learning (RL). A learned policy is the product of the problem to be solved, as embodied by the RL environment, and the RL algorithm. While enormous attention has been devoted to RL algorithms, much less is known about the impact of design choices for the RL environment. In this paper, we show that environment design matters in significant ways and document how it can contribute to the brittle nature of many RL results. Specifically, we examine choices related to state representations, initial state distributions, reward structure, control frequency, episode termination procedures, curriculum usage, the action space, and the torque limits. We aim to stimulate discussion around such choices, which in practice strongly impact the success of RL when applied to continuous-action control…
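To make the design choices listed in the abstract concrete, the following is a minimal sketch of where several of them would typically appear in an environment's code. It is not taken from the paper: the class names, fields, default values, and the one-dimensional double-integrator "dynamics" are all hypothetical and chosen only for illustration. State representation, curriculum usage, and the choice of action space are not modeled here.

```python
import random
from dataclasses import dataclass


@dataclass
class EnvConfig:
    """Hypothetical knobs, one per design choice illustrated here."""
    action_repeat: int = 4          # control frequency: physics steps per policy action
    torque_limit: float = 1.0       # actions are clipped to [-limit, +limit]
    init_state_noise: float = 0.0   # half-width of the uniform initial state distribution
    terminate_on_fall: bool = True  # episode termination: stop early on failure
    fall_limit: float = 0.5         # |theta| above this counts as a fall
    max_steps: int = 200            # time limit in control steps
    alive_bonus: float = 1.0        # reward structure: per-step bonus


class ToyBalanceEnv:
    """Toy 1D double integrator; a stand-in for a real physics simulator."""
    PHYSICS_DT = 0.01

    def __init__(self, config, seed=0):
        self.cfg = config
        self.rng = random.Random(seed)
        self.theta = 0.0
        self.vel = 0.0
        self.t = 0

    def reset(self):
        # Initial state distribution: uniform noise around the origin.
        n = self.cfg.init_state_noise
        self.theta = self.rng.uniform(-n, n)
        self.vel = self.rng.uniform(-n, n)
        self.t = 0
        return (self.theta, self.vel)

    def step(self, action):
        # Torque limit: clip the requested action.
        limit = self.cfg.torque_limit
        torque = max(-limit, min(limit, action))
        # Control frequency: hold the same torque for several physics steps.
        for _ in range(self.cfg.action_repeat):
            self.vel += torque * self.PHYSICS_DT
            self.theta += self.vel * self.PHYSICS_DT
        self.t += 1
        # Reward structure: alive bonus minus a quadratic deviation penalty.
        reward = self.cfg.alive_bonus - self.theta ** 2
        # Episode termination: optional early stop, plus a time limit.
        fell = abs(self.theta) > self.cfg.fall_limit
        done = (self.cfg.terminate_on_fall and fell) or self.t >= self.cfg.max_steps
        return (self.theta, self.vel), reward, done


def rollout_length(env, action):
    """Apply a constant action until the episode ends; return the step count."""
    env.reset()
    steps = 0
    done = False
    while not done:
        _, _, done = env.step(action)
        steps += 1
    return steps
```

For example, comparing rollout_length(ToyBalanceEnv(EnvConfig()), 1.0) with the same call under EnvConfig(terminate_on_fall=False) shows how a single termination flag changes the episode length, and therefore the data a learning algorithm would see, without any change to the algorithm itself.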
