Fast Task Inference with Variational Intrinsic Successor Features

Steven Hansen; Will Dabney; Andre Barreto; Tom Van de Wiele; David Warde-Farley; Volodymyr Mnih

arXiv:1906.05030 · cs.LG · January 28, 2020 · 22 cites

Open Access

TL;DR

This paper introduces VISR, a new algorithm combining variational methods and successor features to enable rapid task inference and better generalization in reinforcement learning, validated on Atari games with limited reward exposure.

Contribution

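VISR combines variational intrinsic motivation, which rewards policies for being distinguishable from one another, with successor features, which generalize behaviors beyond the finite set learned explicitly. The combination improves generalization and enables fast task inference in reinforcement learning.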

Findings

01 Achieved human-level performance on 14 Atari games

02 Outperformed all baseline methods in limited reward exposure setup

03 Validated effectiveness of VISR in diverse environments

Abstract

It has been established that diverse behaviors spanning the controllable subspace of a Markov decision process can be trained by rewarding a policy for being distinguishable from other policies \citep{gregor2016variational, eysenbach2018diversity, warde2018unsupervised}. However, one limitation of this formulation is generalizing behaviors beyond the finite set being explicitly learned, as is needed for use on subsequent tasks. Successor features \citep{dayan93improving, barreto2017successor} provide an appealing solution to this generalization problem, but require defining the reward function as linear in some grounded feature space. In this paper, we show that these two techniques can be combined, and that each method solves the other's primary limitation. To do so, we introduce Variational Intrinsic Successor FeatuRes (VISR), a novel algorithm which learns controllable features that…
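
The linearity requirement mentioned in the abstract can be illustrated with a small sketch. Assuming rewards of the form r = phi · w, a task vector w can be estimated from a handful of (feature, reward) pairs by least squares, and action values then follow as psi · w, where psi denotes the successor features. Everything below (the function names, the synthetic features, the three-dimensional feature space) is a hypothetical illustration under those assumptions, not the paper's implementation.

```python
import numpy as np


def infer_task_vector(features, rewards):
    """Least-squares estimate of w under the assumption r ≈ features @ w."""
    w, *_ = np.linalg.lstsq(features, rewards, rcond=None)
    return w


def q_values(successor_features, w):
    """Action values Q(s, a) = psi(s, a) · w, one row of psi per action."""
    return successor_features @ w


# Synthetic data (hypothetical): 50 observed transitions, 3-dimensional features.
rng = np.random.default_rng(0)
true_w = np.array([1.0, -2.0, 0.5])   # unknown task vector to recover
phi = rng.normal(size=(50, 3))        # features observed alongside rewards
r = phi @ true_w                      # rewards that are exactly linear in phi

w_hat = infer_task_vector(phi, r)

# Hypothetical successor features for 4 actions in a single state.
psi = rng.normal(size=(4, 3))
q = q_values(psi, w_hat)
greedy_action = int(np.argmax(q))
```

Because the synthetic rewards are noise-free and exactly linear, the regression recovers `true_w` up to numerical precision; with real rewards the fit would only be approximate.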

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning