Learning to reinforcement learn
Jane X Wang, Zeb Kurth-Nelson, Dhruva Tirumala, Hubert Soyer, Joel Z. Leibo, Remi Munos, Charles Blundell, Dharshan Kumaran, Matt Botvinick

TL;DR
This paper introduces deep meta-reinforcement learning: a recurrent network is trained with one RL algorithm so that its recurrent dynamics come to implement a second, separate RL procedure that adapts rapidly to new tasks.
Contribution
The paper extends recurrent meta-learning from the supervised setting to reinforcement learning. The learned RL algorithm adapts quickly and exploits domain structure, as demonstrated in seven proof-of-concept experiments.
Findings
Recurrent networks can support meta-learning in RL settings.
Learned RL algorithms can differ from the original training algorithm.
The approach shows promise for scaling and neuroscience applications.
Abstract
In recent years deep reinforcement learning (RL) systems have attained superhuman performance in a number of challenging task domains. However, a major limitation of such applications is their demand for massive amounts of training data. A critical present objective is thus to develop deep RL methods that can adapt rapidly to new tasks. In the present work we introduce a novel approach to this challenge, which we refer to as deep meta-reinforcement learning. Previous work has shown that recurrent networks can support meta-learning in a fully supervised context. We extend this approach to the RL setting. What emerges is a system that is trained using one RL algorithm, but whose recurrent dynamics implement a second, quite separate RL procedure. This second, learned RL algorithm can differ from the original one in arbitrary ways. Importantly, because it is learned, it is configured to…
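The abstract's central idea, a network trained by one RL algorithm whose recurrent dynamics carry out a second one, can be illustrated with a minimal Python sketch. This is an illustration under stated assumptions, not the authors' implementation: the Bernoulli bandit task, the plain tanh recurrent cell, the input encoding (previous action plus previous reward), and all function names are hypothetical choices made here. The weights are random, and the outer-loop RL training that would shape them is omitted. The sketch shows only the structural point: within a task the weights stay fixed, so any adaptation has to happen in the hidden state.

```python
import numpy as np

def init_params(n_arms, n_hidden, rng):
    """Random weights standing in for a trained network (assumption: no training here)."""
    W_in = rng.normal(0.0, 0.5, size=(n_hidden, n_arms + 1))
    W_h = rng.normal(0.0, 0.5, size=(n_hidden, n_hidden))
    W_out = rng.normal(0.0, 0.5, size=(n_arms, n_hidden))
    return W_in, W_h, W_out

def make_input(prev_action, prev_reward, n_arms):
    """One-hot previous action concatenated with the previous reward."""
    x = np.zeros(n_arms + 1)
    if prev_action is not None:
        x[prev_action] = 1.0
    x[-1] = prev_reward
    return x

def run_episode(params, arm_probs, n_trials, rng):
    """Roll out one bandit task. Weights are never updated inside this function."""
    W_in, W_h, W_out = params
    n_arms = len(arm_probs)
    h = np.zeros(W_h.shape[0])  # hidden state is reset at the start of each task
    prev_action, prev_reward = None, 0.0
    actions, rewards = [], []
    for _ in range(n_trials):
        x = make_input(prev_action, prev_reward, n_arms)
        h = np.tanh(W_in @ x + W_h @ h)  # reward history accumulates in h
        logits = W_out @ h
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()  # softmax policy over arms
        action = int(rng.choice(n_arms, p=probs))
        reward = float(rng.random() < arm_probs[action])
        actions.append(action)
        rewards.append(reward)
        prev_action, prev_reward = action, reward
    return actions, rewards

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    params = init_params(n_arms=2, n_hidden=8, rng=rng)
    for task in range(3):
        # Each task is a new bandit drawn from a task distribution.
        arm_probs = rng.uniform(0.0, 1.0, size=2)
        _, rewards = run_episode(params, arm_probs, n_trials=20, rng=rng)
        print(f"task {task}: mean reward {np.mean(rewards):.2f}")
```

In a full system, an outer RL algorithm would adjust `W_in`, `W_h`, and `W_out` across many sampled tasks so that the fixed-weight rollout above behaves like an efficient learner on each new task.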
Taxonomy
Topics: Reinforcement Learning in Robotics · Neural dynamics and brain function · Neural Networks and Reservoir Computing
