Abstraction for Offline Goal-Conditioned Reinforcement Learning

Clarisse Wibault; Alexander Goldie; Antonio Villares; Maike Osborne; Jakob Foerster

arXiv:2605.22711·cs.LG·May 22, 2026

TL;DR

This paper introduces a hierarchical abstraction framework for offline goal-conditioned reinforcement learning, leveraging relativised options to improve experience reuse and performance.

Contribution

It proposes a novel hierarchy-based abstraction method with relativised options and demonstrates its effectiveness through two new algorithms.

Findings

01. Hierarchical abstraction improves offline GCRL performance.

02. Relativised options enable better experience reuse across contexts.

03. Algorithms based on this framework outperform baselines in experiments.

Abstract

Markov Decision Processes (MDPs) often exhibit significant redundancy due to symmetries and shared structure across state-goal pairs in real-world Goal-Conditioned Reinforcement Learning (GCRL). While hierarchical policies have been motivated for horizon reduction via temporal abstraction in offline GCRL, we demonstrate that hierarchy also enables absolute abstraction. By introducing relativised options as well as distinct representations for different levels of the hierarchy, we demonstrate how an agent can reuse experience across similar contexts of the state-space. Based on this framework, we introduce two simple algorithms for learning relativised options and abstracting from the absolute frame of reference. Our experiments show that such inductive biases significantly improve performance in offline GCRL.
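
The abstract does not say how relativisation is implemented, so the following is only a toy illustration of the general idea of abstracting away from an absolute frame of reference, not the paper's algorithms. It assumes a translation-invariant task in which states and goals share the same integer coordinates; the function names `absolute_input` and `relative_input` are hypothetical and chosen for this sketch.

```python
# Toy sketch (an assumption, not the authors' method): two state-goal pairs
# that differ only by a translation look different in the absolute frame
# but identical in a relative frame, so experience from one can be reused
# for the other.

def absolute_input(state, goal):
    """Policy input in the absolute frame: state and goal coordinates."""
    return tuple(state) + tuple(goal)

def relative_input(state, goal):
    """Policy input in a relative frame: displacement from state to goal."""
    return tuple(g - s for s, g in zip(state, goal))

# Two contexts in different regions of the state space, same displacement.
s1, g1 = (0, 0), (1, 2)
s2, g2 = (5, 5), (6, 7)

same_absolute = absolute_input(s1, g1) == absolute_input(s2, g2)
same_relative = relative_input(s1, g1) == relative_input(s2, g2)

print("absolute inputs match:", same_absolute)
print("relative inputs match:", same_relative)
```

In this sketch a low-level policy conditioned on `relative_input` would receive the same input in both contexts, which is one simple way that data collected in one region could inform behaviour in another. Whether the paper's relativised options use a displacement of this form is not stated in the abstract.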
