Successor Feature Sets: Generalizing Successor Representations Across Policies

Kianté Brantley; Soroush Mehri; Geoffrey J. Gordon

arXiv:2103.02650·cs.LG·March 17, 2021

TL;DR

This paper introduces a new, highly expressive successor-style representation that generalizes across policies, latent states, and reward functions, bridging model-based and model-free reinforcement learning.

Contribution

It develops a novel successor-style representation with a Bellman equation that integrates multiple information sources, enabling efficient policy and reward generalization.

Findings

01 Lets an agent efficiently read off optimal policies for new reward functions

02 Enables imitation of new demonstrations

03 Generalizes POMDP value iteration to multiple information sources

Abstract

Successor-style representations have many advantages for reinforcement learning: for example, they can help an agent generalize from past experience to new goals, and they have been proposed as explanations of behavioral and neural data from human and animal learners. They also form a natural bridge between model-based and model-free RL methods: like the former they make predictions about future experiences, and like the latter they allow efficient prediction of total discounted rewards. However, successor-style representations are not optimized to generalize across policies: typically, we maintain a limited-length list of policies, and share information among them by representation learning or generalized policy improvement (GPI). Successor-style representations also typically make no provision for gathering information or reasoning about latent variables. To address these limitations, we bring together ideas from…
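
To make the "efficient prediction of total discounted rewards" point concrete, here is a minimal sketch of the classic tabular successor representation for a single fixed policy. It is not the successor feature set construction from the paper. The 3-state chain, the discount factor, and the two reward vectors are hypothetical values chosen only for illustration.

```python
import numpy as np

def successor_representation(P, gamma):
    """Return M = sum_t gamma^t P^t = (I - gamma P)^{-1}.

    P is the state-to-state transition matrix induced by ONE fixed policy,
    so M[s, s'] is the expected discounted number of visits to s' from s.
    """
    n = P.shape[0]
    return np.linalg.inv(np.eye(n) - gamma * P)

# Hypothetical 3-state chain: 0 -> 1 -> 2, and state 2 is absorbing.
P = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 1.0]])
gamma = 0.5  # assumed discount factor

M = successor_representation(P, gamma)

# Once M is known, the value function for any new reward vector is a
# single matrix-vector product; the dynamics are not consulted again.
r_a = np.array([0.0, 0.0, 1.0])  # hypothetical reward: paid in state 2
r_b = np.array([1.0, 0.0, 0.0])  # hypothetical reward: paid in state 0
v_a = M @ r_a
v_b = M @ r_b
```

Because `M` is tied to one policy, this sketch generalizes across rewards but not across policies, which is the limitation the abstract describes.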

Taxonomy

Topics: Reinforcement Learning in Robotics · Explainable Artificial Intelligence (XAI) · Advanced Bandit Algorithms Research