Hindsight Experience Replay
Marcin Andrychowicz, Filip Wolski, Alex Ray, Jonas Schneider, Rachel Fong, Peter Welinder, Bob McGrew, Josh Tobin, Pieter Abbeel, Wojciech Zaremba

TL;DR
Hindsight Experience Replay is a technique that enhances sample efficiency in reinforcement learning with sparse rewards by reusing past experiences with modified goals, demonstrated on robotic manipulation tasks.
Contribution
The paper introduces Hindsight Experience Replay, a method that improves sample efficiency in sparse-reward settings and can be combined with any off-policy RL algorithm.
Findings
Enables effective learning in environments with binary sparse rewards
Improves training efficiency and success rates in robotic manipulation tasks
Policies trained in simulation transfer successfully to physical robots
Abstract
Dealing with sparse rewards is one of the biggest challenges in Reinforcement Learning (RL). We present a novel technique called Hindsight Experience Replay which allows sample-efficient learning from rewards which are sparse and binary and therefore avoid the need for complicated reward engineering. It can be combined with an arbitrary off-policy RL algorithm and may be seen as a form of implicit curriculum. We demonstrate our approach on the task of manipulating objects with a robotic arm. In particular, we run experiments on three different tasks: pushing, sliding, and pick-and-place, in each case using only binary rewards indicating whether or not the task is completed. Our ablation studies show that Hindsight Experience Replay is a crucial ingredient which makes training possible in these challenging environments. We show that our policies trained on a physics simulation can be…
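The core idea of reusing past experience with modified goals can be illustrated with a minimal sketch. It assumes a goal-conditioned setting where each transition records the goal it was collected under, and it shows only one simple relabeling variant: treating the state actually reached at the end of an episode as if it had been the goal. All names here (Transition, binary_reward, relabel_with_final_goal, store_episode) and the tolerance value are hypothetical, and the -1/0 reward convention is an assumption made for illustration; the abstract states only that rewards are binary.

```python
from typing import List, NamedTuple, Tuple

State = Tuple[float, ...]


class Transition(NamedTuple):
    state: State
    action: int
    next_state: State
    goal: State
    reward: float


def binary_reward(achieved: State, goal: State, tol: float = 0.05) -> float:
    """Sparse binary reward (assumed convention): 0.0 on success, -1.0 otherwise."""
    reached = all(abs(a - g) <= tol for a, g in zip(achieved, goal))
    return 0.0 if reached else -1.0


def relabel_with_final_goal(episode: List[Transition]) -> List[Transition]:
    """Return copies of the episode's transitions with the goal replaced by
    the state the episode actually ended in, and rewards recomputed."""
    if not episode:
        return []
    hindsight_goal = episode[-1].next_state
    return [
        t._replace(
            goal=hindsight_goal,
            reward=binary_reward(t.next_state, hindsight_goal),
        )
        for t in episode
    ]


def store_episode(buffer: List[Transition], episode: List[Transition]) -> None:
    """Store the original transitions and their hindsight-relabeled copies."""
    buffer.extend(episode)
    buffer.extend(relabel_with_final_goal(episode))


# Usage: a 1-D episode that never reaches the intended goal at 1.0.
goal = (1.0,)
episode = [
    Transition((0.0,), 1, (0.2,), goal, binary_reward((0.2,), goal)),
    Transition((0.2,), 1, (0.4,), goal, binary_reward((0.4,), goal)),
]
replay_buffer: List[Transition] = []
store_episode(replay_buffer, episode)
```

In this toy episode every original transition carries the failure reward, so an off-policy learner would see no success signal. The relabeled copies contain one transition whose recomputed reward indicates success for the substituted goal, which is what lets learning proceed from an otherwise uninformative episode. Any off-policy algorithm that samples from the buffer could consume both kinds of transitions unchanged.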
Videos
Hindsight Experience Replay | Two Minute Papers #192 · YouTube
Taxonomy
Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Advanced Bandit Algorithms Research
Methods: Experience Replay
