Progressive Reinforcement Learning with Distillation for Multi-Skilled Motion Control
Glen Berseth, Cheng Xie, Paul Cernek, Michiel Van de Panne

TL;DR
This paper introduces PLAID, a progressive reinforcement learning approach that combines multiple skills into a single policy using distillation, input augmentation, and transfer learning, demonstrated on simulated bipedal locomotion tasks.
Contribution
The paper extends policy distillation to continuous actions and proposes a novel framework for incremental skill integration in reinforcement learning.
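To illustrate what distilling expert policies into one continuous-action policy can look like, here is a minimal sketch. It is not the paper's implementation. The linear student, the two toy experts, the mean-squared-error loss on actions, and all names (`distill_linear_policy`, `expert_a`, `expert_b`) are assumptions made for the example.

```python
import numpy as np

def distill_linear_policy(states, expert_actions, lr=0.1, steps=500):
    """Fit a linear student a = s @ W + b to expert actions by gradient descent on MSE."""
    n, state_dim = states.shape
    action_dim = expert_actions.shape[1]
    W = np.zeros((state_dim, action_dim))
    b = np.zeros(action_dim)
    for _ in range(steps):
        err = states @ W + b - expert_actions   # student action minus expert action
        W -= lr * states.T @ err / n            # gradient of 0.5 * mean squared error w.r.t. W
        b -= lr * err.mean(axis=0)              # gradient w.r.t. the bias
    return W, b

def mse(W, b, states, actions):
    return float(np.mean((states @ W + b - actions) ** 2))

rng = np.random.default_rng(0)

# Two hypothetical experts, each queried only on its own state distribution.
states_a = rng.normal(loc=-1.0, size=(256, 3))
states_b = rng.normal(loc=+1.0, size=(256, 3))
W_a = rng.normal(size=(3, 2))
W_b = rng.normal(size=(3, 2))
expert_a = lambda s: s @ W_a
expert_b = lambda s: s @ W_b

# Pool (state, expert action) pairs from both experts into one supervised dataset.
states = np.vstack([states_a, states_b])
targets = np.vstack([expert_a(states_a), expert_b(states_b)])

initial_loss = mse(np.zeros((3, 2)), np.zeros(2), states, targets)
W, b = distill_linear_policy(states, targets)
final_loss = mse(W, b, states, targets)
print(f"distillation loss: {initial_loss:.3f} -> {final_loss:.3f}")
```

With continuous actions the student regresses onto the experts' actions instead of matching a distribution over discrete choices. A linear student can only partly reconcile the two toy experts, so the loss drops but does not reach zero.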
Findings
PLAID effectively merges multiple skills into a single policy.
The method outperforms three baseline approaches in simulated locomotion tasks.
Input injection and transfer learning enhance skill acquisition efficiency.
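One way to add new input features to an existing policy network without disturbing what it has already learned is sketched below. The zero initialization of the new first-layer weights, the two-layer network, and the helper names (`inject_inputs`, `policy`) are assumptions made for illustration. This summary does not say how the paper initializes the injected inputs.

```python
import numpy as np

def inject_inputs(W1, n_new):
    """Append n_new zero rows to first-layer weights W1 (shape: in_dim x hidden)."""
    zeros = np.zeros((n_new, W1.shape[1]), dtype=W1.dtype)
    return np.vstack([W1, zeros])

def policy(x, W1, b1, W2, b2):
    """Tiny two-layer policy: tanh hidden layer, linear action output."""
    h = np.tanh(x @ W1 + b1)
    return h @ W2 + b2

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 8))
b1 = np.zeros(8)
W2 = rng.normal(size=(8, 2))
b2 = np.zeros(2)

x_old = rng.normal(size=(5, 4))   # original state features
x_new = rng.normal(size=(5, 3))   # hypothetical new features, e.g. terrain samples

W1_aug = inject_inputs(W1, n_new=3)
x_aug = np.hstack([x_old, x_new])

before = policy(x_old, W1, b1, W2, b2)
after = policy(x_aug, W1_aug, b1, W2, b2)
print(np.allclose(before, after))
```

Because the new rows start at zero, the augmented network produces the same actions as the original until training updates those weights. Learning on a new task can then start from the existing behaviour.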
Abstract
Deep reinforcement learning has demonstrated increasing capabilities for continuous control problems, including agents that can move with skill and agility through their environment. An open problem in this setting is that of developing good strategies for integrating or merging policies for multiple skills, where each individual policy is a specialist in a specific skill and its associated state distribution. We extend policy distillation methods to the continuous action setting and leverage this technique to combine expert policies, as evaluated in the domain of simulated bipedal locomotion across different classes of terrain. We also introduce an input injection method for augmenting an existing policy network to exploit new input features. Lastly, our method uses transfer learning to assist in the efficient acquisition of new skills. The combination of these methods allows a policy…
Taxonomy
Topics: Reinforcement Learning in Robotics · Robotic Locomotion and Control · Robot Manipulation and Learning
