Successor Feature Sets: Generalizing Successor Representations Across Policies
Kianté Brantley, Soroush Mehri, Geoffrey J. Gordon

TL;DR
This paper introduces a new, highly expressive successor-style representation that generalizes across policies, latent states, and reward functions, bridging model-based and model-free reinforcement learning.
Contribution
The paper develops a novel successor-style representation with a Bellman equation that integrates multiple information sources, enabling efficient generalization across policies and reward functions.
Findings
Lets an agent efficiently read off optimal policies for new reward functions
Enables imitation of new demonstrations
Generalizes POMDP value iteration to multiple information sources
Abstract
Successor-style representations have many advantages for reinforcement learning: for example, they can help an agent generalize from past experience to new goals, and they have been proposed as explanations of behavioral and neural data from human and animal learners. They also form a natural bridge between model-based and model-free RL methods: like the former they make predictions about future experiences, and like the latter they allow efficient prediction of total discounted rewards. However, successor-style representations are not optimized to generalize across policies: typically, we maintain a limited-length list of policies, and share information among them by representation learning or GPI. Successor-style representations also typically make no provision for gathering information or reasoning about latent variables. To address these limitations, we bring together ideas from…
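To make the "bridge" described in the abstract concrete, here is a minimal sketch of the classical tabular successor representation for a single fixed policy. It is not the paper's successor feature sets. The names `successor_representation` and `values`, the three-state chain, and the discount of 0.5 are all illustrative assumptions. The matrix M holds expected discounted future state visits (a model-based style prediction), and one matrix-vector product with a reward vector gives total discounted reward (as a model-free value function would).

```python
def successor_representation(P, gamma, n_iters=500):
    """Approximate M = (I - gamma * P)^-1 by iterating M <- I + gamma * P @ M.

    P is the state-transition matrix under one fixed policy (hypothetical input),
    given as a list of rows. M[i][j] is the expected discounted number of
    visits to state j when starting from state i.
    """
    n = len(P)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    M = [row[:] for row in identity]
    for _ in range(n_iters):
        M = [
            [
                identity[i][j] + gamma * sum(P[i][k] * M[k][j] for k in range(n))
                for j in range(n)
            ]
            for i in range(n)
        ]
    return M


def values(M, reward):
    """Value of every state for a per-state reward vector: V = M @ reward."""
    return [sum(m * r for m, r in zip(row, reward)) for row in M]


# Illustrative 3-state chain: 0 -> 1 -> 2, with state 2 absorbing.
P = [
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0],
]
M = successor_representation(P, gamma=0.5)

# The same M is reused for two different reward functions.
v_goal_at_end = values(M, [0.0, 0.0, 1.0])
v_goal_at_start = values(M, [1.0, 0.0, 0.0])
```

Because M depends on the policy but not on the reward, switching to a new reward only costs one matrix-vector product. The limitation the abstract points to is also visible here: M is tied to a single policy, so covering many policies in this classical setup means storing one such matrix per policy.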
Taxonomy
Topics: Reinforcement Learning in Robotics · Explainable Artificial Intelligence (XAI) · Advanced Bandit Algorithms Research
