Learning to Locomote: Understanding How Environment Design Matters for Deep Reinforcement Learning

Daniele Reda; Tianxin Tao; Michiel van de Panne

arXiv:2010.04304 · cs.LG · October 12, 2020

TL;DR

This paper shows that environment design choices, such as the state representation, initial state distribution, reward structure, control frequency, episode termination, curriculum, action space, and torque limits, strongly influence whether deep reinforcement learning succeeds on locomotion tasks, and that they contribute to the brittleness of many RL results.

Contribution

The paper examines eight kinds of environment design choice, from state representation to torque limits, and documents how they shape RL outcomes and contribute to the brittleness of learned locomotion policies.

Findings

01

Environment design choices significantly affect whether RL succeeds on locomotion tasks.

02

The same choices can contribute to the brittle nature of many RL results.

03

In practice, environment design strongly impacts the success of RL applied to continuous-action control.

Abstract

Learning to locomote is one of the most common tasks in physics-based animation and deep reinforcement learning (RL). A learned policy is the product of the problem to be solved, as embodied by the RL environment, and the RL algorithm. While enormous attention has been devoted to RL algorithms, much less is known about the impact of design choices for the RL environment. In this paper, we show that environment design matters in significant ways and document how it can contribute to the brittle nature of many RL results. Specifically, we examine choices related to state representations, initial state distributions, reward structure, control frequency, episode termination procedures, curriculum usage, the action space, and the torque limits. We aim to stimulate discussion around such choices, which in practice strongly impact the success of RL when applied to continuous-action control…
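To make a few of the design choices listed in the abstract concrete, here is a minimal sketch of how they might appear as explicit parameters of an environment. It is not the authors' code: `EnvDesign`, `ToyPointMassEnv`, `rollout`, every field name, and every default value are hypothetical, and a 1-D point mass stands in for a real locomotion simulator. The sketch covers only control frequency, torque limits, the initial state distribution, and early termination.

```python
from dataclasses import dataclass
import random


@dataclass
class EnvDesign:
    """Hypothetical bundle of environment design choices (illustrative names and values)."""
    sim_hz: int = 240               # physics steps per second
    control_hz: int = 30            # policy queries per second
    torque_limit: float = 2.0       # actions are clipped to [-limit, +limit]
    init_state_noise: float = 0.1   # half-width of the uniform initial position range
    fall_threshold: float = 1.0     # |position| beyond this counts as a failure
    early_termination: bool = True  # end the episode as soon as a failure occurs


class ToyPointMassEnv:
    """1-D unit point mass; a stand-in for a locomotion simulator (assumption)."""

    def __init__(self, design: EnvDesign, seed: int = 0):
        self.d = design
        self.rng = random.Random(seed)
        # Control frequency: each policy action is held for several physics steps.
        self.substeps = design.sim_hz // design.control_hz
        self.dt = 1.0 / design.sim_hz
        self.pos = 0.0
        self.vel = 0.0

    def reset(self):
        # Initial state distribution: uniform noise around the origin.
        n = self.d.init_state_noise
        self.pos = self.rng.uniform(-n, n)
        self.vel = 0.0
        return (self.pos, self.vel)

    def step(self, action: float):
        # Torque limit: clip the requested action before applying it.
        lim = self.d.torque_limit
        force = max(-lim, min(lim, action))
        for _ in range(self.substeps):
            self.vel += force * self.dt
            self.pos += self.vel * self.dt
        reward = -abs(self.pos)  # toy reward: stay near the origin
        # Episode termination: optionally stop once the failure condition holds.
        done = self.d.early_termination and abs(self.pos) > self.d.fall_threshold
        return (self.pos, self.vel), reward, done


def rollout(design: EnvDesign, policy, max_steps: int = 100, seed: int = 0):
    """Run one episode; return (number of control steps taken, total reward)."""
    env = ToyPointMassEnv(design, seed)
    obs = env.reset()
    total = 0.0
    for t in range(1, max_steps + 1):
        obs, reward, done = env.step(policy(obs))
        total += reward
        if done:
            return t, total
    return max_steps, total
```

Calling `rollout` with the same policy under two `EnvDesign` instances that differ in a single field (for example `early_termination=False`, or a different `control_hz`) changes the episode length and return the learner would see, which is the kind of sensitivity the paper studies on real locomotion tasks.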
