Pre-training Neural Networks with Human Demonstrations for Deep Reinforcement Learning

Gabriel V. de la Cruz Jr; Yunshu Du; Matthew E. Taylor

arXiv:1709.04083 · cs.LG · April 9, 2019 · 26 cites

TL;DR

This paper proposes pre-training neural networks with human demonstrations to improve feature learning in deep reinforcement learning, significantly reducing training time on Atari games.

Contribution

It introduces a supervised pre-training approach using human demonstrations to enhance feature learning in deep RL, leading to faster training.

Findings

01. Pre-training with human demonstrations outperforms naive pre-training in feature discovery.

02. Pre-trained models significantly reduce training time in deep RL.

03. The approach is effective even with a small number of human demonstrations.

Abstract

Deep reinforcement learning (deep RL) has achieved superior performance in complex sequential tasks by using a deep neural network as its function approximator and by learning directly from raw images. A drawback of using raw images is that deep RL must learn the state feature representation from the raw images in addition to learning a policy. As a result, deep RL can require a prohibitively large amount of training time and data to reach reasonable performance, making it difficult to use deep RL in real-world applications, especially when data is expensive. In this work, we speed up training by addressing half of what deep RL is trying to solve: learning features. Our approach is to learn some of the important features by pre-training the deep RL network's hidden layers via supervised learning using a small set of human demonstrations. We empirically evaluate our approach using deep…
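The pre-training idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tiny one-hidden-layer network, the toy two-feature "states", the synthetic demonstrations, and every function name below (`pretrain`, `init_q_network`, `q_values`) are assumptions made for illustration. The sketch trains a classifier to predict the demonstrator's action from the state, then copies only the hidden layer into a freshly initialized Q-network. Re-initializing the output layer is also an assumption here; the abstract only states that hidden layers are pre-trained.

```python
import math
import random

def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases for one fully connected layer."""
    scale = 1.0 / math.sqrt(n_in)
    w = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return w, [0.0] * n_out

def affine(w, b, x):
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]

def relu(v):
    return [max(0.0, a) for a in v]

def softmax(v):
    m = max(v)
    e = [math.exp(a - m) for a in v]
    s = sum(e)
    return [a / s for a in e]

def pretrain(demos, n_in, n_hidden, n_actions, epochs=50, lr=0.1, seed=0):
    """Supervised pre-training: classify the demonstrated action from the state."""
    rng = random.Random(seed)
    w1, b1 = init_layer(n_in, n_hidden, rng)
    w2, b2 = init_layer(n_hidden, n_actions, rng)
    for _ in range(epochs):
        for x, a in demos:
            h_pre = affine(w1, b1, x)
            h = relu(h_pre)
            p = softmax(affine(w2, b2, h))
            # Gradient of cross-entropy w.r.t. the logits: p - onehot(a).
            d_out = [p[k] - (1.0 if k == a else 0.0) for k in range(n_actions)]
            # Backpropagate through the output layer and ReLU before updating w2.
            d_h = [
                sum(w2[k][j] * d_out[k] for k in range(n_actions)) if h_pre[j] > 0 else 0.0
                for j in range(n_hidden)
            ]
            for k in range(n_actions):
                for j in range(n_hidden):
                    w2[k][j] -= lr * d_out[k] * h[j]
                b2[k] -= lr * d_out[k]
            for j in range(n_hidden):
                for i in range(n_in):
                    w1[j][i] -= lr * d_h[j] * x[i]
                b1[j] -= lr * d_h[j]
    return (w1, b1), (w2, b2)

def init_q_network(pretrained_hidden, n_hidden, n_actions, seed=1):
    """Copy the pre-trained hidden layer; start the Q-value head from scratch (assumption)."""
    w1, b1 = pretrained_hidden
    hidden = ([row[:] for row in w1], b1[:])
    head = init_layer(n_hidden, n_actions, random.Random(seed))
    return hidden, head

def q_values(q_net, x):
    (w1, b1), (w2, b2) = q_net
    return affine(w2, b2, relu(affine(w1, b1, x)))

# Hypothetical demonstrations: the "demonstrator" picks action 0 when x[0] > x[1].
data_rng = random.Random(42)
demos = []
for _ in range(40):
    state = [data_rng.random(), data_rng.random()]
    demos.append((state, 0 if state[0] > state[1] else 1))

hidden, classifier_head = pretrain(demos, n_in=2, n_hidden=8, n_actions=2)
q_net = init_q_network(hidden, n_hidden=8, n_actions=2)
print(q_values(q_net, [0.3, 0.7]))  # initial Q-value estimates, one per action
```

From here, a deep RL algorithm such as DQN would continue training `q_net` with its usual updates; the only change relative to training from scratch is the starting point of the hidden-layer weights.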

Taxonomy

Topics: Anomaly Detection Techniques and Applications

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Convolution · Dense Connections · Q-Learning · Deep Q-Network