Policies Modulating Trajectory Generators

Atil Iscen; Ken Caluwaerts; Jie Tan; Tingnan Zhang; Erwin Coumans; Vikas Sindhwani; Vincent Vanhoucke

arXiv:1910.02812·cs.RO·October 8, 2019·38 cites

TL;DR

This paper introduces a flexible architecture called Policies Modulating Trajectory Generators (PMTG) that combines simple policies with trajectory generators to learn controllable, periodic behaviors in robotics, demonstrated on quadrupedal locomotion.

Contribution

The paper presents PMTG, an architecture in which a simple policy modulates a parametric trajectory generator, and uses it to learn controllable behaviors, including locomotion on a real quadrupedal robot.

Findings

01 Controllable walking behaviors are learned with simple linear policies paired with trajectory generators.

02 Speed-controlled locomotion is achieved with under 1000 training rollouts.

03 Learned policies transfer successfully from simulation to a real robot.

Abstract

We propose an architecture for learning complex controllable behaviors by having simple Policies Modulate Trajectory Generators (PMTG), a powerful combination that can provide both memory and prior knowledge to the controller. The result is a flexible architecture that is applicable to a class of problems with periodic motion for which one has an insight into the class of trajectories that might lead to a desired behavior. We illustrate the basics of our architecture using a synthetic control problem, then go on to learn speed-controlled locomotion for a quadrupedal robot by using Deep Reinforcement Learning and Evolutionary Strategies. We demonstrate that a simple linear policy, when paired with a parametric Trajectory Generator for quadrupedal gaits, can induce walking behaviors with controllable speed from 4-dimensional IMU observations alone, and can be learned in under 1000…
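To make the idea of a policy modulating a trajectory generator concrete, here is a minimal sketch of one control step. It is an illustration under stated assumptions, not the authors' implementation: the class names, the sinusoidal generator, the default frequency and amplitude, and the choice to add a residual correction to the generator output are all hypothetical. Only the 4-dimensional observation and the linear policy come from the abstract.

```python
import numpy as np


class SineTrajectoryGenerator:
    """Hypothetical TG: every actuator follows one sinusoid driven by a shared phase."""

    def __init__(self, num_actuators):
        self.num_actuators = num_actuators
        self.phase = 0.0  # internal state; this is the controller's "memory"

    def state(self):
        # Expose the phase as (sin, cos) so the policy sees a continuous signal.
        return np.array([np.sin(self.phase), np.cos(self.phase)])

    def step(self, frequency, amplitude, dt):
        # Advance the phase, then emit the periodic target for each actuator.
        self.phase = (self.phase + 2.0 * np.pi * frequency * dt) % (2.0 * np.pi)
        return amplitude * np.sin(self.phase) * np.ones(self.num_actuators)


class LinearPolicy:
    """Linear map from [observation, TG state] to [TG parameters, residual]."""

    def __init__(self, weights):
        self.weights = weights

    def act(self, policy_input):
        return self.weights @ policy_input


def pmtg_step(policy, tg, observation, dt=0.01):
    # The policy observes the robot and the generator's internal state.
    policy_input = np.concatenate([observation, tg.state()])
    out = policy.act(policy_input)
    # Assumed layout: two TG parameters (as offsets from defaults), then a residual.
    frequency = 1.0 + out[0]
    amplitude = 0.5 + out[1]
    residual = out[2:]
    # Final action: modulated periodic trajectory plus the policy's correction.
    return tg.step(frequency, amplitude, dt) + residual


# Usage: 4-D observation, 8 actuators, an untrained (all-zero) policy.
tg = SineTrajectoryGenerator(num_actuators=8)
policy = LinearPolicy(np.zeros((2 + 8, 4 + 2)))
action = pmtg_step(policy, tg, observation=np.zeros(4))
```

With all-zero weights the policy leaves the generator at its default parameters, so the output is the unmodified periodic trajectory. In this sketch that default trajectory plays the role of the prior knowledge mentioned in the abstract, and training only has to learn how to modulate it.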

Taxonomy

Topics: Robotic Locomotion and Control · Reinforcement Learning in Robotics

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings