Improvements on Hindsight Learning

Ameet Deshpande; Srikanth Sarma; Ashutosh Jha; Balaraman Ravindran

arXiv:1809.06719 · cs.LG · November 6, 2018 · 1 cites

TL;DR

This paper explores various replay strategies, especially prioritized replay, to enhance Hindsight Experience Replay in goal-directed reinforcement learning tasks, and demonstrates the application of Hindsight Policy Gradient methods to robotic tasks.

Contribution

It introduces prioritized replay variants for Hindsight Experience Replay and applies Hindsight Policy Gradient methods to robotic control tasks.

Findings

01. Prioritized replay improves learning efficiency in hindsight experience replay.

02. Hindsight Policy Gradient methods are effective in robotic tasks.

03. Enhanced replay strategies lead to better goal-conditioned policies.

Abstract

Sparse reward problems are one of the biggest challenges in Reinforcement Learning. Goal-directed tasks are one such class of sparse reward problems, where a reward signal is received only when the goal is reached. One promising way to train an agent to perform goal-directed tasks is to use Hindsight Learning approaches. In these approaches, even when an agent fails to reach the desired goal, it learns to reach the goal it achieved instead. By doing this over multiple trajectories and generalizing the policy learned from the achieved goals, the agent learns a goal-conditioned policy that can reach any goal. One such approach is Hindsight Experience Replay, which uses an off-policy Reinforcement Learning algorithm to learn a goal-conditioned policy. In this approach, past transitions are replayed uniformly at random. Another approach is to use a Hindsight version of the policy…
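The hindsight relabeling idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The `Transition` tuple, the 0/-1 reward convention, the grid states, and the function names are all assumptions made for illustration. The prioritized sampler takes caller-supplied weights because the excerpt does not say how priorities are computed.

```python
import random
from typing import List, NamedTuple, Sequence, Tuple

State = Tuple[int, int]  # hypothetical grid position used for illustration


class Transition(NamedTuple):
    state: State
    action: int
    next_state: State
    goal: State
    reward: float


def sparse_reward(achieved: State, goal: State) -> float:
    """Assumed sparse reward: 0 when the goal is reached, -1 otherwise."""
    return 0.0 if achieved == goal else -1.0


def relabel_with_final_goal(episode: List[Transition]) -> List[Transition]:
    """Pretend the state the episode ended in was the goal all along."""
    achieved_goal = episode[-1].next_state
    return [
        t._replace(goal=achieved_goal,
                   reward=sparse_reward(t.next_state, achieved_goal))
        for t in episode
    ]


def sample_uniform(buffer: Sequence[Transition], batch_size: int,
                   rng: random.Random) -> List[Transition]:
    """Uniform replay: every stored transition is equally likely."""
    return [rng.choice(buffer) for _ in range(batch_size)]


def sample_prioritized(buffer: Sequence[Transition],
                       priorities: Sequence[float], batch_size: int,
                       rng: random.Random) -> List[Transition]:
    """Proportional replay: sampling probability scales with the priority.

    The priorities are supplied by the caller (assumption); how they are
    computed is not specified in the excerpt.
    """
    return rng.choices(buffer, weights=priorities, k=batch_size)


# A failed episode: the desired goal (5, 5) is never reached.
episode = [
    Transition((0, 0), 1, (1, 0), (5, 5), -1.0),
    Transition((1, 0), 1, (2, 0), (5, 5), -1.0),
]
hindsight = relabel_with_final_goal(episode)
buffer = episode + hindsight
```

In this sketch the original episode contains only failure rewards, while the relabeled copy ends with a success for the achieved goal `(2, 0)`. Storing both copies gives an off-policy learner some non-trivial reward signal even though the desired goal was never reached.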

Taxonomy

Topics: Reinforcement Learning in Robotics · Video Surveillance and Tracking Methods · Human Pose and Action Recognition

Methods: Experience Replay