GCHR: Goal-Conditioned Hindsight Regularization for Sample-Efficient Reinforcement Learning

Xing Lei; Wenyan Yang; Kaiqiang Ke; Shentao Yang; Xuetao Zhang; Joni Pajarinen; Donglin Wang

arXiv:2508.06108·cs.LG·August 11, 2025

TL;DR

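This paper introduces Hindsight Goal-conditioned Regularization (HGR), which improves sample efficiency in sparse-reward goal-conditioned reinforcement learning by generating action regularization priors from hindsight goals. Combined with hindsight self-imitation regularization (HSR), it achieves more efficient sample reuse than existing GCRL methods that rely on HER and self-imitation.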

Contribution

The paper proposes HGR, a regularization technique that, combined with hindsight self-imitation regularization (HSR), improves the sample efficiency of off-policy GCRL methods.

Findings

01. HGR achieves more efficient sample reuse than existing methods.

02. The combined approach outperforms prior GCRL techniques on navigation and manipulation tasks.

03. Empirical results demonstrate state-of-the-art performance with the proposed method.

Abstract

Goal-conditioned reinforcement learning (GCRL) with sparse rewards remains a fundamental challenge in reinforcement learning. While hindsight experience replay (HER) has shown promise by relabeling collected trajectories with achieved goals, we argue that trajectory relabeling alone does not fully exploit the available experiences in off-policy GCRL methods, resulting in limited sample efficiency. In this paper, we propose Hindsight Goal-conditioned Regularization (HGR), a technique that generates action regularization priors based on hindsight goals. When combined with hindsight self-imitation regularization (HSR), our approach enables off-policy RL algorithms to maximize experience utilization. Compared to existing GCRL methods that employ HER and self-imitation techniques, our hindsight regularizations achieve substantially more efficient sample reuse and the best performances, which…
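
The trajectory relabeling that the abstract attributes to HER can be illustrated with a minimal sketch. This is a generic "future"-style relabeling step, not HGR or HSR themselves, whose details the abstract does not give. The function name `relabel_with_hindsight`, the dictionary keys, the `tolerance` parameter, and the 0 / -1 sparse reward convention are all assumptions made for illustration.

```python
import random


def relabel_with_hindsight(trajectory, tolerance=1e-6, rng=None):
    """Relabel each transition with a goal achieved later in the same trajectory.

    trajectory: list of dicts with keys "state", "action", "achieved_goal",
    where "achieved_goal" is a tuple of floats reached after taking the action.
    All names here are hypothetical, chosen for this sketch only.
    """
    rng = rng or random.Random(0)
    relabeled = []
    for t, step in enumerate(trajectory):
        # Pick a hindsight goal from this step or any later step.
        future = rng.randrange(t, len(trajectory))
        new_goal = trajectory[future]["achieved_goal"]
        # Assumed sparse reward: 0 when the relabeled goal is reached, else -1.
        dist = max(abs(a - g) for a, g in zip(step["achieved_goal"], new_goal))
        reward = 0.0 if dist <= tolerance else -1.0
        relabeled.append({
            "state": step["state"],
            "action": step["action"],
            "goal": new_goal,
            "reward": reward,
        })
    return relabeled


# Usage: a toy 1-D trajectory that never reached its original goal.
toy = [
    {"state": (0.0,), "action": 1, "achieved_goal": (1.0,)},
    {"state": (1.0,), "action": 1, "achieved_goal": (2.0,)},
    {"state": (2.0,), "action": 1, "achieved_goal": (3.0,)},
]
for transition in relabel_with_hindsight(toy):
    print(transition)
```

Every relabeled transition now carries a goal that the agent actually reached, so at least the final transition receives a non-penalized reward even though the original episode failed. The abstract's argument is that this relabeling alone leaves some of the collected experience underused, which is the gap the hindsight regularizers are meant to address.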

Taxonomy

Topics: Machine Learning and ELM · Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research