Learning When to Switch: Composing Controllers to Traverse a Sequence of Terrain Artifacts

Brendan Tidd; Nicolas Hudson; Akansel Cosgun; Jurgen Leitner

arXiv:2011.00440·cs.RO·September 30, 2021

Open Access

TL;DR

This paper introduces a method for training multiple deep reinforcement learning policies for legged robots to traverse different terrains, and a switching network to select the appropriate policy, enabling better generalization and scalability.

Contribution

The authors develop a curriculum learning approach to create overlapping policies for terrain traversal and a switching network to select policies, improving adaptability over prior methods.

Findings

01. Switching network outperforms heuristic methods in unseen terrains.

02. Policies trained on individual terrains perform comparably to full-set training.

03. Method scales to many behaviors with embedded prior knowledge.

Abstract

Legged robots often use separate control policies that are highly engineered for traversing difficult terrain such as stairs, gaps, and steps, where switching between policies is only possible when the robot is in a region that is common to adjacent controllers. Deep Reinforcement Learning (DRL) is a promising alternative to hand-crafted control design, though typically requires the full set of test conditions to be known before training. DRL policies can result in complex (often unrealistic) behaviours that have few or no overlapping regions between adjacent policies, making it difficult to switch behaviours. In this work we develop multiple DRL policies with Curriculum Learning (CL), each that can traverse a single respective terrain condition, while ensuring an overlap between policies. We then train a network for each destination policy that estimates the likelihood of successfully switching from…
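The switching idea in the abstract (one estimator per destination policy that predicts whether a switch would succeed) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `select_policy` and `toy_estimator`, the 0.85 threshold, the state layout, and the assumption that the upcoming terrain type is already known are all hypothetical choices made for illustration. In practice the estimator would be a trained network rather than a hand-written function.

```python
import math
from typing import Callable, Dict, Sequence

# Hypothetical type: maps a robot state vector to an estimated probability
# that switching into one particular destination policy would succeed.
SwitchEstimator = Callable[[Sequence[float]], float]


def toy_estimator(state: Sequence[float]) -> float:
    """Stand-in for a trained switch estimator (assumption, not from the paper).

    Confidence peaks when state[0] is near 0, mimicking a region of state
    space where the current and destination policies overlap.
    """
    return math.exp(-state[0] ** 2)


def select_policy(
    current: str,
    upcoming: str,
    state: Sequence[float],
    estimators: Dict[str, SwitchEstimator],
    threshold: float = 0.85,  # hypothetical confidence threshold
) -> str:
    """Return the name of the policy to run at this timestep.

    The robot keeps its current policy until the destination policy's
    estimator predicts a successful switch with enough confidence.
    """
    if upcoming == current:
        return current
    if estimators[upcoming](state) >= threshold:
        return upcoming
    return current


if __name__ == "__main__":
    estimators = {"stairs": toy_estimator}
    for s in ([1.0], [0.5], [0.1]):
        print(s, select_policy("walk", "stairs", s, estimators))
```

Keeping one estimator per destination policy, as the abstract describes, means a new behaviour could in principle be added by training its policy and its estimator without retraining the others; the dictionary keyed by destination name reflects that design.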

Taxonomy

Topics: Reinforcement Learning in Robotics · Robotic Locomotion and Control · Robot Manipulation and Learning