Which Features are Best for Successor Features?
Yann Ollivier

TL;DR
This paper identifies the optimal base features for successor features in reinforcement learning, using an objective criterion of downstream performance. The optimal features generally differ from Laplacian eigenfunctions and are the same across the task families considered.
Contribution
It introduces a criterion for selecting the base features of successor features that optimizes expected downstream performance without assuming that downstream rewards are linear in the features, and it characterizes the resulting features in deterministic environments.
Findings
Optimal features are the same across different task families.
These features are eigenfunctions of a modified Laplacian operator.
Results are derived under offline RL assumptions with large regularization.
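For comparison, the standard Laplacian eigenfunctions that these findings are contrasted with can be sketched on a toy example. This is a minimal illustration, not the paper's construction: it assumes a uniform random walk on a ring as the reference policy, takes the Laplacian to be Id − P (one common convention, assumed here), and builds the time-reversed kernel from the stationary distribution. The names `P`, `rho`, `P_rev`, and `laplacian_features` are hypothetical.

```python
import numpy as np

n = 8  # number of states on the ring (illustrative choice)

# Transition matrix of a uniform random walk on a ring: the assumed reference policy.
P = np.zeros((n, n))
for s in range(n):
    P[s, (s - 1) % n] = 0.5
    P[s, (s + 1) % n] = 0.5

# Stationary distribution of this walk (uniform on a ring).
rho = np.full(n, 1.0 / n)

# Time-reversed kernel: P_rev[s, s'] = rho[s'] * P[s', s] / rho[s].
P_rev = (P.T * rho[None, :]) / rho[:, None]

# Assumed convention: Laplacian = Id - P, and likewise for the reversed dynamics.
laplacian = np.eye(n) - P
laplacian_rev = np.eye(n) - P_rev

# The symmetrized operator. With a uniform rho it is a symmetric matrix,
# so a symmetric eigensolver applies directly.
sym = laplacian + laplacian_rev
eigvals, eigvecs = np.linalg.eigh(sym)  # eigenvalues in ascending order

# Keep the eigenvectors with the smallest eigenvalues (the smoothest functions),
# which is the usual choice when Laplacian eigenfunctions are used as features.
d = 3
laplacian_features = eigvecs[:, :d]
```

On this symmetric toy chain the reversed dynamics coincide with the forward ones, so the example only shows the mechanics. With a non-uniform stationary distribution the operator is self-adjoint with respect to `rho` rather than symmetric as a matrix, and the eigenproblem would need to be reweighted accordingly.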
Abstract
In reinforcement learning, universal successor features (SFs) are a way to provide zero-shot adaptation to new tasks at test time: they provide optimal policies for all downstream reward functions lying in the linear span of a set of base features. But it is unclear what constitutes a good set of base features that could be useful for a wide set of downstream tasks beyond their linear span. Laplacian eigenfunctions (the eigenfunctions of Δ + Δ* with Δ the Laplacian operator of some reference policy and Δ* that of the time-reversed dynamics) have been argued to play a role, and offer good empirical performance. Here, for the first time, we identify the optimal base features based on an objective criterion of downstream performance, in a non-tautological way without assuming the downstream tasks are linear in the features. We do this for three generic…
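The zero-shot property described in the abstract can be illustrated in a small tabular setting. The sketch below is an assumption-laden toy, not the paper's method: it uses a deterministic ring environment, random base features `phi`, and a reward `phi @ z` lying in their span. Instead of a learned universal model, it recomputes the successor features of the current policy in closed form and improves the policy greedily (plain policy iteration), which reaches the same fixed point. All function and variable names are hypothetical.

```python
import numpy as np

def successor_features(next_state, phi, policy, gamma):
    """Successor features psi[s, a] of a deterministic policy.

    psi[s, a] = phi[s] + gamma * m[next_state[s, a]], where m[s] is the
    discounted sum of features collected from s when following `policy`.
    """
    n = next_state.shape[0]
    # State-to-state transition matrix induced by the policy.
    P_pi = np.zeros((n, n))
    P_pi[np.arange(n), next_state[np.arange(n), policy]] = 1.0
    # Solve (I - gamma * P_pi) m = phi for the state-level successor features.
    m = np.linalg.solve(np.eye(n) - gamma * P_pi, phi)
    return phi[:, None, :] + gamma * m[next_state]

def zero_shot_policy(next_state, phi, z, gamma, max_iters=100):
    """Greedy policy for the reward phi @ z, using Q(s, a) = psi(s, a) . z."""
    policy = np.zeros(next_state.shape[0], dtype=int)
    for _ in range(max_iters):
        psi = successor_features(next_state, phi, policy, gamma)
        q = psi @ z  # Q-values of the current policy, shape (n_states, n_actions)
        new_policy = q.argmax(axis=1)
        if np.array_equal(new_policy, policy):
            break  # the policy is greedy with respect to its own Q-values
        policy = new_policy
    return policy, q

# Toy deterministic ring: action 0 moves left, action 1 moves right.
rng = np.random.default_rng(0)
n, d, gamma = 6, 3, 0.9
next_state = np.stack([(np.arange(n) - 1) % n, (np.arange(n) + 1) % n], axis=1)

phi = rng.normal(size=(n, d))  # assumed base features, one row per state
z = rng.normal(size=d)         # task vector; the reward is phi @ z
policy, q = zero_shot_policy(next_state, phi, z, gamma)
```

Because the reward here is exactly linear in `phi`, the resulting policy is optimal for it. The question the paper addresses is which `phi` to choose so that performance stays good for rewards outside this span, which this toy does not attempt to answer.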
Taxonomy
Topics: Semantic Web and Ontologies
Methods: Balanced Selection · Sparse Evolutionary Training
