Navigation with QPHIL: Quantizing Planner for Hierarchical Implicit Q-Learning
Alexi Canesse, Mathieu Petitbois, Ludovic Denoyer, Sylvain Lamprier, Rémy Portelas

TL;DR
This paper introduces QPHIL, a hierarchical transformer-based offline RL method that uses a learned quantization of the space to improve navigation, enable explicit trajectory stitching, and reach state-of-the-art performance in complex environments.
Contribution
It presents a novel hierarchical transformer approach with learned space quantization, enhancing offline RL for navigation by enabling explicit trajectory stitching and simplified planning.
Findings
Achieves state-of-the-art results in complex navigation tasks.
Enables explicit trajectory stitching through zone-level reasoning.
Simplifies planning with a discrete autoregressive model.
Abstract
Offline Reinforcement Learning (RL) has emerged as a powerful alternative to imitation learning for behavior modeling in various domains, particularly in complex navigation tasks. An existing challenge with offline RL is the signal-to-noise ratio, i.e., how to mitigate incorrect policy updates caused by errors in value estimates. To address this, multiple works have demonstrated the advantage of hierarchical offline RL methods, which decouple high-level path planning from low-level path following. In this work, we present a novel hierarchical transformer-based approach leveraging a learned quantizer of the space. This quantization enables the training of a simpler zone-conditioned low-level policy and simplifies planning, which is reduced to discrete autoregressive prediction. Among other benefits, zone-level reasoning in planning enables explicit trajectory stitching rather than implicit…
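As a rough illustration of the zone-level view the abstract describes, the sketch below maps continuous 2-D states to discrete zone indices and turns a trajectory into a zone sequence. It is not the paper's implementation: the fixed `CODEBOOK`, the nearest-centroid lookup in `quantize`, and the merging of consecutive repeats in `to_zone_sequence` are all assumptions made for this example, whereas the paper's quantizer is learned.

```python
import math

# Hypothetical 2-D codebook: one centroid per "zone". The paper learns its
# quantizer; these fixed values are an assumption for illustration only.
CODEBOOK = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]


def quantize(state, codebook=CODEBOOK):
    """Return the index of the centroid nearest to a 2-D state."""
    return min(range(len(codebook)), key=lambda k: math.dist(state, codebook[k]))


def to_zone_sequence(trajectory, codebook=CODEBOOK):
    """Map states to zone indices, merging consecutive repeats (an assumption)."""
    zones = []
    for state in trajectory:
        zone = quantize(state, codebook)
        if not zones or zones[-1] != zone:
            zones.append(zone)
    return zones


# A short hypothetical trajectory moving from zone 0 through zone 1 to zone 2.
trajectory = [(0.1, 0.0), (0.4, 0.1), (0.9, 0.1), (1.0, 0.6), (0.9, 1.1)]
zone_sequence = to_zone_sequence(trajectory)
print(zone_sequence)
```

Under these assumptions, a high-level planner would only need to predict the next zone index given the sequence so far, which is the sense in which planning becomes discrete autoregressive prediction; the planner and the zone-conditioned low-level policy are not shown here.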
Taxonomy
Topics: AI-based Problem Solving and Planning · Robotic Path Planning Algorithms · Speech and dialogue systems
