Variational Intrinsic Control

Karol Gregor; Danilo Jimenez Rezende; Daan Wierstra

arXiv:1611.07507·cs.LG·November 23, 2016·180 cites

TL;DR

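This paper presents an unsupervised reinforcement learning method that discovers an agent's intrinsic options by maximizing the number of distinct states it can reliably reach, measured as the mutual information between options and their termination states. The method also yields an explicit measure of empowerment in a given state.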

Contribution

Introduces two policy gradient algorithms for intrinsic option discovery: one that builds an explicit embedding space of options and one that represents options implicitly. Both provide an explicit measure of empowerment in a given state.

Findings

01 Algorithms scale well with function approximation
02 Effective in various task environments
03 Provides explicit empowerment metrics

Abstract

In this paper we introduce a new unsupervised reinforcement learning method for discovering the set of intrinsic options available to an agent. This set is learned by maximizing the number of different states an agent can reliably reach, as measured by the mutual information between the set of options and option termination states. To this end, we instantiate two policy gradient based algorithms, one that creates an explicit embedding space of options and one that represents options implicitly. The algorithms also provide an explicit measure of empowerment in a given state that can be used by an empowerment maximizing agent. The algorithm scales well with function approximation and we demonstrate the applicability of the algorithm on a range of tasks.
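
The objective named in the abstract, the mutual information between options and option termination states, can be illustrated on a toy tabular case. The sketch below is not the paper's algorithm (which uses policy gradients and function approximation); it only computes the quantity exactly for a hand-written distribution. The function name `mutual_information` and the two example tables are assumptions made for illustration.

```python
import math


def mutual_information(p_option, p_state_given_option):
    """Exact I(option; final state) in nats for a small tabular model.

    p_option[o]                -- probability of choosing option o
    p_state_given_option[o][s] -- probability that option o terminates in state s
    (Both tables are hypothetical inputs, not quantities from the paper.)
    """
    n_states = len(p_state_given_option[0])
    # Marginal over final states: p(s) = sum_o p(o) * p(s | o)
    p_state = [
        sum(p_o * row[s] for p_o, row in zip(p_option, p_state_given_option))
        for s in range(n_states)
    ]
    mi = 0.0
    for p_o, row in zip(p_option, p_state_given_option):
        for s, p_s_given_o in enumerate(row):
            # Zero-probability terms contribute nothing to the sum.
            if p_o > 0 and p_s_given_o > 0:
                mi += p_o * p_s_given_o * math.log(p_s_given_o / p_state[s])
    return mi


uniform = [0.25, 0.25, 0.25, 0.25]

# Each option reliably terminates in its own distinct state.
reliable = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

# Every option terminates in a uniformly random state, so the options are
# indistinguishable from their outcomes.
unreliable = [[0.25, 0.25, 0.25, 0.25] for _ in range(4)]

mi_reliable = mutual_information(uniform, reliable)
mi_unreliable = mutual_information(uniform, unreliable)
```

With four options that each reach a distinct state, the mutual information equals log 4 nats, the logarithm of the number of reliably reachable states. When the termination state carries no information about the option, it is zero. This matches the abstract's reading of the objective as counting the states an agent can reliably reach.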

Code & Models

Repositories

jbinas/gym-mnist (pytorch)

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Smart Grid Energy Management