Offline Goal-Conditioned Reinforcement Learning with Projective Quasimetric Planning

Anthony Kobanda; Waris Radji; Mathieu Petitbois; Odalric-Ambrym Maillard; Rémy Portelas

arXiv:2506.18847 · cs.LG · February 2, 2026

TL;DR

This paper introduces ProQ, a geometric framework for offline goal-conditioned reinforcement learning that learns an asymmetric distance to improve long-horizon goal reaching by guiding agents through meaningful sub-goals.

Contribution

ProQ combines metric learning, keypoint coverage, and goal-conditioned control into a unified approach, addressing long-horizon challenges in offline RL.

Findings

01. Effective in diverse navigation benchmarks

02. Produces meaningful sub-goals for long-horizon tasks

03. Robustly drives goal-reaching in complex environments

Abstract

Offline Goal-Conditioned Reinforcement Learning seeks to train agents to reach specified goals from previously collected trajectories. Scaling that promise to long-horizon tasks remains challenging, notably due to compounding value-estimation errors. Principled geometry offers a potential solution to these issues. Following this insight, we introduce Projective Quasimetric Planning (ProQ), a compositional framework that learns an asymmetric distance and then repurposes it, firstly as a repulsive energy forcing a sparse set of keypoints to spread uniformly over the learned latent space, and secondly as a structured directional cost guiding towards proximal sub-goals. In particular, ProQ couples this geometry with a Lagrangian out-of-distribution detector to ensure the learned keypoints stay within reachable areas. By unifying metric learning, keypoint coverage, and…
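
To make the two roles of the asymmetric distance more concrete, here is a minimal toy sketch in plain Python. It is not the ProQ implementation: the abstract does not specify the form of the learned distance, so `quasimetric` below is a hand-written stand-in (a sum of positive coordinate differences), and `repulsive_energy`, `next_subgoal`, and the `radius` parameter are hypothetical names and choices introduced only for illustration. The out-of-distribution detector is omitted entirely.

```python
import itertools


def quasimetric(x, y):
    """Toy asymmetric distance (an assumption, not the learned one).

    Only coordinate increases from x to y are charged, so d(x, y) and
    d(y, x) generally differ, while d(x, x) == 0 and the triangle
    inequality still hold.
    """
    return sum(max(0.0, yi - xi) for xi, yi in zip(x, y))


def repulsive_energy(keypoints, eps=1e-6):
    """Hypothetical repulsive energy over a set of keypoints.

    Sums the inverse of the symmetrized distance over all pairs, so the
    value drops as keypoints move apart from each other.
    """
    total = 0.0
    for a, b in itertools.combinations(keypoints, 2):
        symmetrized = quasimetric(a, b) + quasimetric(b, a)
        total += 1.0 / (symmetrized + eps)
    return total


def next_subgoal(state, goal, keypoints, radius):
    """Hypothetical proximal sub-goal rule.

    Among keypoints within `radius` of the current state (measured from
    state to keypoint), return the one closest to the goal (measured
    from keypoint to goal). Returns None if no keypoint is in range.
    """
    nearby = [k for k in keypoints if quasimetric(state, k) <= radius]
    if not nearby:
        return None
    return min(nearby, key=lambda k: quasimetric(k, goal))
```

For example, with keypoints at `(1, 0)`, `(2, 0)`, and `(5, 0)`, a state at `(0, 0)`, a goal at `(3, 0)`, and `radius=2.0`, the rule discards `(5, 0)` as too far and selects `(2, 0)` because it is the in-range keypoint nearest the goal. In the paper's setting the distance would be learned from offline trajectories rather than fixed by hand.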

Taxonomy

Methods: Sparse Evolutionary Training