Distral: Robust Multitask Reinforcement Learning

Yee Whye Teh; Victor Bapst; Wojciech Marian Czarnecki; John Quan; James Kirkpatrick; Raia Hadsell; Nicolas Heess; Razvan Pascanu

arXiv:1707.04175·cs.LG·July 14, 2017·183 cites

TL;DR

Distral introduces a novel multitask reinforcement learning method that shares a distilled policy across tasks, improving data efficiency, stability, and transfer performance in complex environments.

Contribution

The paper proposes Distral, a new approach that shares a distilled policy among tasks, enhancing stability and transfer in multitask reinforcement learning.

Findings

1. Outperforms related methods in complex 3D environments
2. Supports efficient transfer across tasks
3. Offers a more robust and stable learning process

Abstract

Most deep reinforcement learning algorithms are data inefficient in complex and rich environments, limiting their applicability to many scenarios. One direction for improving data efficiency is multitask learning with shared neural network parameters, where efficiency may be improved through transfer across related tasks. In practice, however, this is not usually observed, because gradients from different tasks can interfere negatively, making learning unstable and sometimes even less data efficient. Another issue is the different reward schemes between tasks, which can easily lead to one task dominating the learning of a shared model. We propose a new approach for joint training of multiple tasks, which we refer to as Distral (Distill & transfer learning). Instead of sharing parameters between the different workers, we propose to share a "distilled" policy that captures common…
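To illustrate what sharing a distilled policy across tasks can look like, here is a minimal sketch, not the authors' implementation. It assumes tabular policies over three discrete actions at a single state, and it assumes that the distance between a task policy and the shared policy is measured with a KL divergence, which the visible part of the abstract does not state. The function names (`kl`, `distill`, `regularized_reward`) and the coefficient `c_kl` are hypothetical.

```python
import math

def kl(p, q):
    """KL(p || q) for two discrete distributions given as lists of probabilities."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def distill(task_policies):
    """Shared policy as the centroid of the task policies.

    Assumption: the centroid minimizes sum_i KL(pi_i || pi_0), whose
    closed-form solution is the arithmetic mean of the task policies.
    """
    n = len(task_policies)
    return [sum(column) / n for column in zip(*task_policies)]

def regularized_reward(reward, action, task_policy, distilled, c_kl=0.5):
    """Task reward minus a penalty for drifting away from the shared policy.

    Assumption: the penalty is c_kl * log(pi_i(a) / pi_0(a)), the per-action
    term of KL(pi_i || pi_0). c_kl is a hypothetical coefficient.
    """
    return reward - c_kl * math.log(task_policy[action] / distilled[action])

# Two hypothetical task policies over three actions at one state.
task_policies = [[0.7, 0.2, 0.1], [0.5, 0.4, 0.1]]
pi0 = distill(task_policies)

# Total divergence of the task policies from the shared policy.
total_kl = sum(kl(p, pi0) for p in task_policies)

# Task 0 prefers action 0 more strongly than the shared policy does,
# so taking that action is penalized slightly.
shaped = regularized_reward(1.0, 0, task_policies[0], pi0)
```

In this toy setting the shared policy is just the average of the task policies, and each task is rewarded less for actions on which it departs from that average. A full agent would use neural network policies and learn both from sampled trajectories.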

Taxonomy

Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control · Machine Learning and ELM