GCHR: Goal-Conditioned Hindsight Regularization for Sample-Efficient Reinforcement Learning
Xing Lei, Wenyan Yang, Kaiqiang Ke, Shentao Yang, Xuetao Zhang, Joni Pajarinen, Donglin Wang

TL;DR
This paper introduces Hindsight Goal-conditioned Regularization (HGR), a method that improves sample efficiency in goal-conditioned reinforcement learning (GCRL) by generating action regularization priors from hindsight goals. The authors report that it outperforms existing GCRL methods that rely on hindsight experience replay (HER) and self-imitation.
Contribution
The paper proposes HGR, a regularization technique that, when combined with hindsight self-imitation regularization (HSR), substantially improves sample efficiency in off-policy GCRL methods.
Findings
HGR, combined with HSR, reuses collected samples substantially more efficiently than existing GCRL methods that employ HER and self-imitation.
The combined approach outperforms prior GCRL techniques on navigation and manipulation tasks.
The authors report the best performance among the compared methods in their experiments.
Abstract
Goal-conditioned reinforcement learning (GCRL) with sparse rewards remains a fundamental challenge in reinforcement learning. While hindsight experience replay (HER) has shown promise by relabeling collected trajectories with achieved goals, we argue that trajectory relabeling alone does not fully exploit the available experiences in off-policy GCRL methods, resulting in limited sample efficiency. In this paper, we propose Hindsight Goal-conditioned Regularization (HGR), a technique that generates action regularization priors based on hindsight goals. When combined with hindsight self-imitation regularization (HSR), our approach enables off-policy RL algorithms to maximize experience utilization. Compared to existing GCRL methods that employ HER and self-imitation techniques, our hindsight regularizations achieve substantially more efficient sample reuse and the best performances, which…
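The trajectory relabeling that the abstract attributes to HER can be illustrated with a short sketch. This is not the paper's HGR or HSR method, whose details are not given in this summary; it only shows the common "future" relabeling strategy, in which each transition's goal is replaced by a goal actually achieved later in the same trajectory. The function names, the dictionary layout of a transition, and the 0 / -1 sparse reward convention are all assumptions made for illustration.

```python
import random


def sparse_reward(achieved_goal, goal, tol=1e-6):
    """Assumed sparse reward: 0.0 if the goal is reached, otherwise -1.0."""
    dist = sum((a - g) ** 2 for a, g in zip(achieved_goal, goal)) ** 0.5
    return 0.0 if dist <= tol else -1.0


def relabel_with_future_goals(trajectory, k=4, rng=None):
    """Relabel each transition with k goals achieved at the same or a later step.

    trajectory: list of dicts with keys 'state', 'action', 'achieved_goal'
    (the goal reached after taking the action) and 'goal' (the original goal).
    Returns a list of new transitions carrying the hindsight goal and reward.
    """
    if rng is None:
        rng = random.Random(0)
    relabeled = []
    horizon = len(trajectory)
    for t, step in enumerate(trajectory):
        for _ in range(k):
            # Pick a step index in [t, horizon - 1] and reuse its achieved goal.
            future = rng.randrange(t, horizon)
            new_goal = trajectory[future]["achieved_goal"]
            relabeled.append({
                "state": step["state"],
                "action": step["action"],
                "goal": new_goal,
                # Reward is recomputed against the hindsight goal.
                "reward": sparse_reward(step["achieved_goal"], new_goal),
            })
    return relabeled
```

A trajectory that never reaches its original goal yields only -1 rewards, but after relabeling some transitions carry a reward of 0, which gives an off-policy learner a usable signal. The abstract's argument is that this relabeling alone still leaves experience underused, which is the gap the proposed regularizations are meant to address.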
Taxonomy
Topics: Machine Learning and ELM · Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research
