Data-Efficient Hierarchical Goal-Conditioned Reinforcement Learning via Normalizing Flows
Shaswat Garg, Matin Moezzi, Brandon Da Silva

TL;DR
This paper introduces NF-HIQL, a novel hierarchical reinforcement learning framework using normalizing flows to improve data efficiency and policy expressivity in complex, long-horizon tasks, with strong theoretical guarantees and empirical performance.
Contribution
It proposes a new flow-based hierarchical implicit Q-learning framework that enhances policy expressivity and data efficiency, with theoretical analysis and empirical validation.
Findings
NF-HIQL outperforms prior methods in diverse tasks.
It demonstrates improved robustness under limited data.
The framework provides theoretical guarantees on stability and generalization.
Abstract
Hierarchical goal-conditioned reinforcement learning (H-GCRL) provides a powerful framework for tackling complex, long-horizon tasks by decomposing them into structured subgoals. However, its practical adoption is hindered by poor data efficiency and limited policy expressivity, especially in offline or data-scarce regimes. This work introduces normalizing flow-based hierarchical implicit Q-learning (NF-HIQL), a novel framework that replaces unimodal Gaussian policies with expressive normalizing flow policies at both the high and low levels of the hierarchy. This design enables tractable log-likelihood computation, efficient sampling, and the ability to model rich multimodal behaviors. New theoretical guarantees are derived, including explicit KL-divergence bounds for real-valued non-volume preserving (RealNVP) policies and PAC-style sample efficiency results, showing that…
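To make the "tractable log-likelihood computation" and "efficient sampling" claims concrete, the following is a minimal NumPy sketch of a conditional RealNVP-style policy built from affine coupling layers. It is an illustration under stated assumptions, not the authors' implementation: the class names `AffineCoupling` and `FlowPolicy`, the layer sizes, the tanh-bounded log-scale, and the random (untrained) weights are all hypothetical. The context vector `ctx` stands in for whatever the policy conditions on, for example a (state, goal) pair at the high level or a (state, subgoal) pair at the low level.

```python
import numpy as np


class AffineCoupling:
    """One RealNVP-style affine coupling layer conditioned on a context vector.

    Hypothetical illustration: weights are random, not trained.
    """

    def __init__(self, dim, ctx_dim, hidden, rng, flip=False):
        self.d = dim // 2          # size of the half that passes through unchanged
        self.flip = flip           # alternate which half is transformed
        out_dim = dim - self.d
        self.W1 = rng.normal(0.0, 0.1, (self.d + ctx_dim, hidden))
        self.b1 = np.zeros(hidden)
        self.W2 = rng.normal(0.0, 0.1, (hidden, 2 * out_dim))
        self.b2 = np.zeros(2 * out_dim)

    def _scale_shift(self, x_a, ctx):
        # Small MLP mapping (untouched half, context) to log-scale and shift.
        h = np.tanh(np.concatenate([x_a, ctx], axis=-1) @ self.W1 + self.b1)
        log_s, t = np.split(h @ self.W2 + self.b2, 2, axis=-1)
        return np.tanh(log_s), t   # bounded log-scale (an assumed stabilizer)

    def forward(self, z, ctx):
        """Map latent z to output x; also return log|det dx/dz|."""
        if self.flip:
            z = z[..., ::-1]
        z_a, z_b = z[..., :self.d], z[..., self.d:]
        log_s, t = self._scale_shift(z_a, ctx)
        x = np.concatenate([z_a, z_b * np.exp(log_s) + t], axis=-1)
        if self.flip:
            x = x[..., ::-1]
        return x, log_s.sum(axis=-1)

    def inverse(self, x, ctx):
        """Map x back to z; return the forward log-determinant at that point."""
        if self.flip:
            x = x[..., ::-1]
        x_a, x_b = x[..., :self.d], x[..., self.d:]
        log_s, t = self._scale_shift(x_a, ctx)
        z = np.concatenate([x_a, (x_b - t) * np.exp(-log_s)], axis=-1)
        if self.flip:
            z = z[..., ::-1]
        return z, log_s.sum(axis=-1)


class FlowPolicy:
    """Stack of coupling layers over a standard normal base distribution."""

    def __init__(self, act_dim, ctx_dim, n_layers=4, hidden=32, seed=0):
        rng = np.random.default_rng(seed)
        self.act_dim = act_dim     # assumed >= 2 so both halves are non-empty
        self.layers = [
            AffineCoupling(act_dim, ctx_dim, hidden, rng, flip=(i % 2 == 1))
            for i in range(n_layers)
        ]

    def sample(self, ctx, rng):
        # Sampling is a single forward pass through the layers.
        z = rng.standard_normal(ctx.shape[:-1] + (self.act_dim,))
        for layer in self.layers:
            z, _ = layer.forward(z, ctx)
        return z

    def log_prob(self, a, ctx):
        # Change of variables: log p(a) = log N(z; 0, I) - sum of log-dets.
        x, log_det = a, 0.0
        for layer in reversed(self.layers):
            x, ld = layer.inverse(x, ctx)
            log_det = log_det + ld
        base = -0.5 * (x ** 2).sum(axis=-1) - 0.5 * self.act_dim * np.log(2 * np.pi)
        return base - log_det


if __name__ == "__main__":
    policy = FlowPolicy(act_dim=4, ctx_dim=6)
    rng = np.random.default_rng(1)
    ctx = rng.standard_normal((5, 6))      # e.g. concatenated state and goal
    actions = policy.sample(ctx, rng)
    print(actions.shape, policy.log_prob(actions, ctx).shape)
```

Because each coupling layer has a triangular Jacobian, the log-determinant is just the sum of the predicted log-scales, which is what keeps the exact log-likelihood cheap. In a hierarchical setup, one such policy could propose subgoals and a second could propose actions, each with its own context; how NF-HIQL trains and weights them is not specified in the text above.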
Taxonomy
Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Muscle activation and electromyography studies
