TL;DR
This paper introduces switching successor measures and FB π-Switch, enabling hierarchical zero-shot reinforcement learning from a single learned representation without extra supervision or fixed horizons.
Contribution
It extends successor measures to allow hierarchical control in zero-shot RL, extracting subgoal and control policies from a unified representation.
Findings
FB π-Switch outperforms non-hierarchical baselines.
It matches state-of-the-art hierarchical methods in goal-conditioned tasks.
Structured successor representations enable flexible hierarchical RL.
Abstract
Hierarchical reinforcement learning can improve generalization by decomposing long-horizon decision-making into simpler subproblems. However, existing approaches often rely on restrictive design choices, such as fixed temporal abstractions or goal-conditioned objectives, which largely confine them to goal-reaching tasks and limit their applicability to general reward functions. In this paper, we introduce switching successor measures, an extension of successor measures that enables hierarchical control in zero-shot reinforcement learning without additional supervision, fixed horizons, or manually designed subgoals. We show that switching successor measures arise naturally from classical successor measures while preserving their underlying structure. Building on this result, we propose FB π-Switch, an algorithm that extracts both a high-level subgoal-selection policy and a low-level…
