Hindsight Goal Ranking on Replay Buffer for Sparse Reward Environment

Tung M. Luu; Chang D. Yoo

arXiv:2110.15043·cs.LG·October 29, 2021

TL;DR

This paper introduces Hindsight Goal Ranking (HGR), a prioritized replay method for sparse-reward environments. HGR samples episodes and hindsight goals with larger TD error more often than HER's uniform sampling does, which speeds up training.

Contribution

HGR is a novel replay sampling method that prioritizes experiences based on TD error, enhancing learning speed over uniform sampling in robotic tasks.

Findings

01

HGR learns significantly faster than uniform sampling.

02

HGR is more sample-efficient than previous methods across multiple robotic tasks.

03

Empirical results demonstrate improved training speed and efficiency.

Abstract

This paper proposes Hindsight Goal Ranking (HGR), a method for prioritizing replay experience that overcomes a limitation of Hindsight Experience Replay (HER), which generates hindsight goals by uniform sampling. HGR samples with higher probability the states visited in an episode that have larger temporal difference (TD) error, which is taken as a proxy for how much the RL agent can learn from an experience. Sampling for large TD error is performed in two steps: first, an episode is sampled from the replay buffer according to the average TD error of its experiences; then, for the sampled episode, the hindsight goal leading to larger TD error is sampled with higher probability from future visited states. The proposed method combined with Deep Deterministic Policy Gradient (DDPG), an off-policy model-free actor-critic algorithm,…
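
The two-step sampling described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the storage of |TD error| as one array per episode, the exponent alpha (borrowed from the usual prioritized-replay convention), and the uniform choice of the transition step within an episode are all assumptions made for this example. Importance-sampling corrections are left out.

```python
import numpy as np

def priorities_to_probs(priorities, alpha=0.6, eps=1e-6):
    """Map non-negative priorities (e.g. |TD error|) to a sampling distribution.
    alpha and eps are illustrative assumptions, not values from the paper."""
    p = (np.asarray(priorities, dtype=float) + eps) ** alpha
    return p / p.sum()

def sample_hgr(td_errors, rng, alpha=0.6):
    """Hypothetical two-step sampler.

    td_errors: list with one array per episode, shape (T, T + 1). Entry [t, g]
    holds the |TD error| of transition t relabeled with the state visited at
    step g as its goal. Only future states (g > t) are valid goals.
    Returns (episode index, transition step t, goal step g).
    """
    # Step 1: sample an episode in proportion to its average |TD error|,
    # averaged over all valid (transition, future goal) pairs.
    episode_priority = []
    for e in td_errors:
        valid = np.triu(np.ones(e.shape, dtype=bool), k=1)  # True where g > t
        episode_priority.append(e[valid].mean())
    ep = int(rng.choice(len(td_errors),
                        p=priorities_to_probs(episode_priority, alpha)))

    # Step 2: within that episode, pick a transition (uniformly here, which is
    # an assumption) and sample its hindsight goal among future visited states
    # in proportion to |TD error|.
    e = td_errors[ep]
    T = e.shape[0]
    t = int(rng.integers(T))
    future_goals = np.arange(t + 1, T + 1)
    g = int(rng.choice(future_goals,
                       p=priorities_to_probs(e[t, t + 1:], alpha)))
    return ep, t, g

# Usage: two episodes of 3 transitions each; the second has larger TD errors,
# so it should be drawn far more often.
rng = np.random.default_rng(0)
buffer_td = [np.zeros((3, 4)), np.full((3, 4), 5.0)]
ep, t, g = sample_hgr(buffer_td, rng, alpha=1.0)
```

Under uniform HER sampling, both the episode and the future goal would be drawn with equal probability; the sketch replaces both draws with TD-error-weighted ones.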

Code & Models

Repositories

tunglm2203/hgr
tf · Official

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Robot Manipulation and Learning

Methods: Experience Replay