Successor Features for Transfer in Reinforcement Learning

André Barreto; Will Dabney; Rémi Munos; Jonathan J. Hunt; Tom Schaul; Hado van Hasselt; David Silver

arXiv:1606.05312 · cs.AI · April 13, 2018 · 178 cites

Open Access

TL;DR

This paper introduces a transfer framework for reinforcement learning that combines successor features with generalized policy improvement. It transfers knowledge across tasks that share the same dynamics but differ in reward function, and it comes with performance guarantees for the transferred policy.

Contribution

The paper presents a transfer method built on successor features and generalized policy improvement, gives theoretical performance guarantees for it, and reports successful transfer in navigation and robotic arm control tasks.

Findings

01. Successfully promotes transfer across tasks with different rewards

02. Outperforms alternative methods in navigation tasks

03. Achieves significant improvements in robotic arm control

Abstract

Transfer in reinforcement learning refers to the notion that generalization should occur not only within a task but also across tasks. We propose a transfer framework for the scenario where the reward function changes between tasks but the environment's dynamics remain the same. Our approach rests on two key ideas: "successor features", a value function representation that decouples the dynamics of the environment from the rewards, and "generalized policy improvement", a generalization of dynamic programming's policy improvement operation that considers a set of policies rather than a single one. Put together, the two ideas lead to an approach that integrates seamlessly within the reinforcement learning framework and allows the free exchange of information across tasks. The proposed method also provides performance guarantees for the transferred policy even before any learning has taken…
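The two ideas named in the abstract can be illustrated with a minimal sketch. It assumes the usual successor-feature formulation, in which rewards are linear in some features, r(s, a) = phi(s, a) · w, so that Q^pi(s, a) = psi^pi(s, a) · w, where psi^pi is the expected discounted sum of features under policy pi. The three-state chain, the feature definition, and the function names `successor_features` and `gpi_action` are hypothetical and chosen only for illustration; they are not taken from the paper's experiments.

```python
import numpy as np

def successor_features(phi, next_state, policy, gamma=0.9, iters=500):
    """Fixed-point iteration for psi(s, a) = phi(s, a) + gamma * psi(s', policy(s')).

    phi:        (S, A, d) features, assumed to satisfy r(s, a) = phi(s, a) @ w
    next_state: (S, A) deterministic transitions (a toy assumption)
    policy:     (S,) one action per state
    """
    psi = np.zeros_like(phi, dtype=float)
    for _ in range(iters):
        # Successor features at s' under the policy's action there: shape (S, A, d)
        psi_next = psi[next_state, policy[next_state]]
        psi = phi + gamma * psi_next
    return psi

def gpi_action(psis, w, state):
    """Generalized policy improvement: argmax over actions of max over stored policies."""
    # Q_i(state, a) = psi_i(state, a) @ w for each stored policy i: shape (n_policies, A)
    q = np.stack([psi[state] @ w for psi in psis])
    return int(q.max(axis=0).argmax())

# Hypothetical chain with states 0, 1, 2 and actions 0 = left, 1 = right.
next_state = np.array([[0, 1], [0, 2], [1, 2]])
# Feature 0 fires when a transition lands in state 0, feature 1 when it lands in state 2.
phi = np.stack([next_state == 0, next_state == 2], axis=-1).astype(float)

# Two previously learned policies: always go left, always go right.
psi_left = successor_features(phi, next_state, np.array([0, 0, 0]))
psi_right = successor_features(phi, next_state, np.array([1, 1, 1]))
psis = [psi_left, psi_right]

# A new task only changes the reward weights w; the dynamics and the stored psi are reused.
print(gpi_action(psis, np.array([0.0, 1.0]), state=1))  # 1 (right): reward sits at state 2
print(gpi_action(psis, np.array([1.0, 0.0]), state=1))  # 0 (left): reward sits at state 0
```

In this sketch the successor features are computed once per policy, and a new reward vector `w` yields action values through a dot product, with no further learning. That is the sense in which the representation decouples dynamics from rewards.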

Taxonomy

Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control · Evolutionary Algorithms and Applications