Learning and reusing primitive behaviours to improve Hindsight Experience Replay sample efficiency

Francisco Roldan Sanchez; Qiang Wang; David Cordova Bulens; Kevin McGuinness; Stephen Redmond; Noel O'Connor

arXiv:2310.01827 · cs.RO · November 21, 2023

TL;DR

This paper introduces a method that leverages learned primitive behaviors to guide exploration in reinforcement learning, significantly improving sample efficiency and training speed in goal-based robotic tasks compared to standard HER.

Contribution

The paper proposes a novel approach that uses a critic network to selectively incorporate primitive behaviors, enhancing exploration and learning efficiency in HER-based reinforcement learning.
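The critic-based selection described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function `select_action`, the toy behaviours, and the toy critic are all hypothetical names, and the rule "pick the candidate the critic values most" is an assumption about how a critic might arbitrate between the agent's own policy and previously learned primitives.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical signatures: each behaviour maps (state, goal) to an action,
# and the critic maps (state, goal, action) to an estimated value.
Behaviour = Callable[[float, float], float]
Critic = Callable[[float, float, float], float]


def select_action(state: float, goal: float, policy: Behaviour,
                  primitives: Sequence[Behaviour], critic: Critic) -> Tuple[float, int]:
    """Return the highest-valued candidate action and its index.

    Index 0 is the agent's own policy; indices 1.. are the primitives.
    """
    candidates: List[float] = [policy(state, goal)]
    candidates += [p(state, goal) for p in primitives]
    values = [critic(state, goal, a) for a in candidates]
    best = max(range(len(candidates)), key=lambda i: values[i])
    return candidates[best], best


# Toy 1-D setting (an assumption for illustration only).
def untrained_policy(state: float, goal: float) -> float:
    return 0.0  # an untrained agent that does nothing useful yet


def move_right(state: float, goal: float) -> float:
    return 1.0


def move_left(state: float, goal: float) -> float:
    return -1.0


def toy_critic(state: float, goal: float, action: float) -> float:
    # Stand-in for a learned Q-function: values actions that end near the goal.
    return -abs(state + action - goal)


action, source = select_action(0.0, 1.0, untrained_policy,
                               [move_right, move_left], toy_critic)
```

In this toy case the critic prefers the `move_right` primitive over the untrained policy, so exploration is steered by a previously learned behaviour without a hand-designed curriculum.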

Findings

01 Faster learning of successful policies with primitive behavior guidance

02 Improved sample efficiency over standard HER

03 Reduced training time in block manipulation tasks

Abstract

Hindsight Experience Replay (HER) is a technique used in reinforcement learning (RL) that has proven to be very efficient for training off-policy RL-based agents to solve goal-based robotic manipulation tasks using sparse rewards. Even though HER improves the sample efficiency of RL-based agents by learning from mistakes made in past experiences, it does not provide any guidance while exploring the environment. This leads to very large training times due to the volume of experience required to train an agent using this replay strategy. In this paper, we propose a method that uses primitive behaviours that have been previously learned to solve simple tasks in order to guide the agent toward more rewarding actions during exploration while learning other more complex tasks. This guidance, however, is not executed by a manually designed curriculum, but rather using a critic network to…
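As background for how HER "learns from mistakes", the sketch below shows hindsight goal relabelling in a toy 1-D setting. It assumes the "final" relabelling strategy and a sparse reward of 0 on success and -1 otherwise; the names `Transition`, `sparse_reward`, and `relabel_final` are hypothetical, and nothing here is taken from the paper's code.

```python
from typing import List, NamedTuple


class Transition(NamedTuple):
    state: float
    action: float
    next_state: float
    goal: float
    reward: float


def sparse_reward(achieved: float, goal: float, tol: float = 1e-6) -> float:
    """Assumed sparse reward: 0 when the goal is reached, -1 otherwise."""
    return 0.0 if abs(achieved - goal) <= tol else -1.0


def relabel_final(episode: List[Transition]) -> List[Transition]:
    """Relabel an episode as if its final achieved state had been the goal."""
    if not episode:
        return []
    new_goal = episode[-1].next_state
    return [
        t._replace(goal=new_goal, reward=sparse_reward(t.next_state, new_goal))
        for t in episode
    ]


# A failed episode: the agent wanted to reach 5.0 but only got to 2.0.
episode = [
    Transition(state=0.0, action=1.0, next_state=1.0, goal=5.0, reward=-1.0),
    Transition(state=1.0, action=1.0, next_state=2.0, goal=5.0, reward=-1.0),
]
relabelled = relabel_final(episode)
```

After relabelling, the last transition carries a success reward for the substituted goal, so the failed episode still yields a useful learning signal when both the original and relabelled transitions are stored in the replay buffer.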

Code & Models

Repositories

franroldans/qmp-her (Official)

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Robot Manipulation and Learning

Methods: Experience Replay