Minimizing Human Assistance: Augmenting a Single Demonstration for Deep Reinforcement Learning

Abraham George, Alison Bartsch, and Amir Barati Farimani

arXiv:2209.11275·cs.LG·March 21, 2023

TL;DR

This paper introduces a method that reduces human involvement in reinforcement learning by augmenting a single human demonstration, which improves training efficiency and enables the agent to solve complex tasks.

Contribution

The authors propose a novel demonstration augmentation technique that enhances RL training with only one human example, significantly reducing human effort while maintaining performance benefits.

Findings

01. Augmenting a single demonstration improves training speed.

02. The method enables solving complex tasks such as block stacking.

03. The agent often learns policies that differ from the human demonstration.

Abstract

The use of human demonstrations in reinforcement learning has proven to significantly improve agent performance. However, any requirement for a human to manually 'teach' the model is somewhat antithetical to the goals of reinforcement learning. This paper attempts to minimize human involvement in the learning process while retaining the performance advantages by using a single human example, collected through a simple-to-use virtual reality simulation, to assist with RL training. Our method augments a single demonstration to generate numerous human-like demonstrations that, when combined with Deep Deterministic Policy Gradients and Hindsight Experience Replay (DDPG + HER), significantly improve training time on simple tasks and allow the agent to solve a complex task (block stacking) that DDPG + HER alone cannot solve. The model achieves this significant training advantage using a single…
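
For readers unfamiliar with the HER half of DDPG + HER, the following is a minimal sketch of hindsight goal relabeling using the common "final" strategy. It is not the authors' demonstration augmentation method, which this page does not describe. The `Transition` layout, the `sparse_reward` function, and the 0.05 tolerance are all hypothetical choices made for illustration.

```python
from typing import List, NamedTuple, Tuple

Vec = Tuple[float, ...]


class Transition(NamedTuple):
    """One step of a goal-conditioned episode (hypothetical layout)."""
    obs: Vec
    action: Vec
    achieved_goal: Vec  # goal state actually reached after the action
    goal: Vec           # goal the agent was asked to reach
    reward: float


def sparse_reward(achieved: Vec, goal: Vec, tol: float = 0.05) -> float:
    """Return 0.0 when `achieved` is within `tol` of `goal`, else -1.0.

    The tolerance is an assumed value, not one taken from the paper.
    """
    dist = sum((a - g) ** 2 for a, g in zip(achieved, goal)) ** 0.5
    return 0.0 if dist <= tol else -1.0


def her_relabel_final(episode: List[Transition]) -> List[Transition]:
    """Relabel every transition with the episode's final achieved goal.

    The relabeled copies pretend the agent was aiming for wherever it
    ended up, so even a failed episode yields a successful example.
    """
    new_goal = episode[-1].achieved_goal
    return [
        t._replace(goal=new_goal,
                   reward=sparse_reward(t.achieved_goal, new_goal))
        for t in episode
    ]
```

In a typical setup, both the original and the relabeled transitions are stored in the replay buffer that the off-policy learner (here, DDPG) samples from.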

Taxonomy

Topics: Reinforcement Learning in Robotics

Methods: Weight Decay · Batch Normalization · Convolution · Adam · Dense Connections · Experience Replay · Deep Deterministic Policy Gradient