Hindsight Experience Replay

Marcin Andrychowicz; Filip Wolski; Alex Ray; Jonas Schneider; Rachel Fong; Peter Welinder; Bob McGrew; Josh Tobin; Pieter Abbeel; Wojciech Zaremba

arXiv:1707.01495 · cs.LG · February 26, 2018 · 352 cites

TL;DR

Hindsight Experience Replay is a technique that enhances sample efficiency in reinforcement learning with sparse rewards by reusing past experiences with modified goals, demonstrated on robotic manipulation tasks.
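The idea of reusing past experience with modified goals can be sketched as follows. This is a minimal illustration, not the authors' implementation: the helper names (`sparse_reward`, `relabel_episode`, `achieved_fn`), the 0 / -1 reward convention, the tolerance `tol`, and the choice to draw substitute goals from states reached later in the same episode are all assumptions made for the example.

```python
import random
from collections import namedtuple

import numpy as np

Transition = namedtuple("Transition", "state action next_state goal reward")


def sparse_reward(achieved, goal, tol=0.05):
    """Binary reward (assumed convention): 0.0 on success, -1.0 otherwise."""
    return 0.0 if np.linalg.norm(achieved - goal) <= tol else -1.0


def relabel_episode(episode, achieved_fn, k=4, rng=random):
    """Return the original transitions plus k hindsight copies per step.

    episode:     list of (state, action, next_state, goal) tuples
    achieved_fn: maps a state to the goal it would satisfy (hypothetical helper)
    Each hindsight copy replaces the goal with one achieved at the same or a
    later step of the episode and recomputes the reward for that new goal.
    """
    out = []
    for t, (s, a, s2, g) in enumerate(episode):
        # Keep the transition with its original goal and reward.
        out.append(Transition(s, a, s2, g, sparse_reward(achieved_fn(s2), g)))
        for _ in range(k):
            # Pick a step in [t, end) and pretend its outcome was the goal.
            future = rng.randrange(t, len(episode))
            new_goal = achieved_fn(episode[future][2])
            r = sparse_reward(achieved_fn(s2), new_goal)
            out.append(Transition(s, a, s2, new_goal, r))
    return out
```

Even when every original transition is a failure, some relabeled copies carry a success reward, which gives the learner a non-trivial signal from an otherwise uninformative episode.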

Contribution

It introduces Hindsight Experience Replay, a method that improves learning efficiency in sparse reward settings and can be integrated with various off-policy RL algorithms.
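Integration with an off-policy learner mostly comes down to storing the goal alongside each transition and feeding the learner goal-conditioned inputs. The sketch below is one assumed way to do that: `GoalReplayBuffer` is a hypothetical helper, and concatenating state and goal into a single observation vector is an illustrative design choice rather than something the page specifies.

```python
import random

import numpy as np


class GoalReplayBuffer:
    """Fixed-capacity ring buffer of goal-conditioned transitions (hypothetical)."""

    def __init__(self, capacity, seed=None):
        self.capacity = capacity
        self.storage = []
        self.next_idx = 0
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state, goal):
        item = (state, action, reward, next_state, goal)
        if len(self.storage) < self.capacity:
            self.storage.append(item)
        else:
            # Buffer is full: overwrite the oldest entry.
            self.storage[self.next_idx] = item
        self.next_idx = (self.next_idx + 1) % self.capacity

    def sample(self, batch_size):
        """Sample uniformly with replacement and stack into arrays."""
        batch = [self.rng.choice(self.storage) for _ in range(batch_size)]
        s, a, r, s2, g = map(np.array, zip(*batch))
        return {
            # The learner sees state and goal as one input vector.
            "obs": np.concatenate([s, g], axis=1),
            "action": a,
            "reward": r,
            "next_obs": np.concatenate([s2, g], axis=1),
        }
```

Because the buffer does not care how a transition's goal was chosen, transitions with substituted goals can be added through the same `add` call as ordinary ones, and any learner that trains from sampled minibatches can consume the result unchanged.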

Findings

01. Enables effective learning in environments with binary sparse rewards

02. Improves training efficiency and success rates in robotic manipulation tasks

03. Policies trained in simulation transfer successfully to physical robots

Abstract

Dealing with sparse rewards is one of the biggest challenges in Reinforcement Learning (RL). We present a novel technique called Hindsight Experience Replay which allows sample-efficient learning from rewards which are sparse and binary and therefore avoids the need for complicated reward engineering. It can be combined with an arbitrary off-policy RL algorithm and may be seen as a form of implicit curriculum. We demonstrate our approach on the task of manipulating objects with a robotic arm. In particular, we run experiments on three different tasks: pushing, sliding, and pick-and-place, in each case using only binary rewards indicating whether or not the task is completed. Our ablation studies show that Hindsight Experience Replay is a crucial ingredient which makes training possible in these challenging environments. We show that our policies trained on a physics simulation can be…

Code & Models

Videos

Hindsight Experience Replay | Two Minute Papers #192 · YouTube

Taxonomy

Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Advanced Bandit Algorithms Research

Methods: Experience Replay