Policies Modulating Trajectory Generators
Atil Iscen, Ken Caluwaerts, Jie Tan, Tingnan Zhang, Erwin Coumans, Vikas Sindhwani, Vincent Vanhoucke

TL;DR
This paper introduces a flexible architecture called Policies Modulating Trajectory Generators (PMTG) that combines simple policies with trajectory generators to learn controllable, periodic behaviors in robotics, demonstrated on quadrupedal locomotion.
Contribution
The paper presents a novel PMTG architecture that effectively integrates simple policies with trajectory generators for learning controllable behaviors, including real-world robot locomotion.
Findings
Simple linear policies paired with trajectory generators learn controllable walking behaviors.
Speed-controlled locomotion is learned in under 1000 training rollouts.
Policies learned in simulation transfer successfully to a real robot.
Abstract
We propose an architecture for learning complex controllable behaviors by having simple Policies Modulate Trajectory Generators (PMTG), a powerful combination that can provide both memory and prior knowledge to the controller. The result is a flexible architecture that is applicable to a class of problems with periodic motion for which one has an insight into the class of trajectories that might lead to a desired behavior. We illustrate the basics of our architecture using a synthetic control problem, then go on to learn speed-controlled locomotion for a quadrupedal robot by using Deep Reinforcement Learning and Evolutionary Strategies. We demonstrate that a simple linear policy, when paired with a parametric Trajectory Generator for quadrupedal gaits, can induce walking behaviors with controllable speed from 4-dimensional IMU observations alone, and can be learned in under 1000…
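The interplay the abstract describes, a simple policy that modulates a periodic trajectory generator which in turn supplies memory and prior knowledge, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the sine-wave generator, the choice of frequency and amplitude as the modulated parameters, the additive per-joint correction, the base values, and the names `SineTrajectoryGenerator`, `LinearPolicy`, and `pmtg_step`. Only the 4-dimensional observation and the linear policy come from the abstract.

```python
import math
import random

# Hypothetical defaults for the generator; not values from the paper.
BASE_FREQUENCY = 1.5  # Hz
BASE_AMPLITUDE = 0.5  # rad


class SineTrajectoryGenerator:
    """Assumed periodic generator. Its phase is the controller's memory."""

    def __init__(self, n_joints):
        self.n_joints = n_joints
        self.phase = 0.0

    def state(self):
        # Expose the phase as (sin, cos) so the policy sees a continuous signal.
        return [math.sin(self.phase), math.cos(self.phase)]

    def step(self, frequency, amplitude, dt):
        # Advance the phase, then emit the same sine target for every joint.
        self.phase = (self.phase + 2.0 * math.pi * frequency * dt) % (2.0 * math.pi)
        return [amplitude * math.sin(self.phase)] * self.n_joints


class LinearPolicy:
    """Stateless linear map from inputs to outputs (no bias term)."""

    def __init__(self, obs_dim, act_dim, scale=0.01, seed=0):
        rng = random.Random(seed)
        self.weights = [
            [scale * rng.gauss(0.0, 1.0) for _ in range(obs_dim)]
            for _ in range(act_dim)
        ]

    def __call__(self, obs):
        return [sum(w * o for w, o in zip(row, obs)) for row in self.weights]


def pmtg_step(policy, tg, observation, dt):
    """One control step: the policy modulates the generator and corrects its output."""
    out = policy(list(observation) + tg.state())
    frequency = BASE_FREQUENCY + out[0]  # assumed modulated parameter
    amplitude = BASE_AMPLITUDE + out[1]  # assumed modulated parameter
    correction = out[2:]                 # assumed additive per-joint residual
    tg_action = tg.step(frequency, amplitude, dt)
    return [a + c for a, c in zip(tg_action, correction)]


# Usage: 4 IMU readings in, 8 joint targets out (8 joints is an assumption).
n_joints = 8
tg = SineTrajectoryGenerator(n_joints)
policy = LinearPolicy(obs_dim=4 + 2, act_dim=2 + n_joints)
action = pmtg_step(policy, tg, observation=[0.0, 0.0, 0.0, 0.0], dt=0.01)
```

In this sketch the policy itself holds no state. The rhythm lives entirely in the generator's phase, which is one plausible reading of how a linear policy could still produce periodic behavior.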
Taxonomy
Topics: Robotic Locomotion and Control · Reinforcement Learning in Robotics
