Jointly Pre-training with Supervised, Autoencoder, and Value Losses for Deep Reinforcement Learning

Gabriel V. de la Cruz Jr.; Yunshu Du; Matthew E. Taylor

arXiv:1904.02206·cs.LG·April 5, 2019·1 cites

TL;DR

This paper introduces a joint pre-training strategy combining supervised, autoencoder, and value losses to incorporate human knowledge into Deep Reinforcement Learning, significantly improving learning efficiency and performance in Atari games.

Contribution

It proposes a novel pre-training method that jointly optimizes multiple losses, enhancing feature learning and accelerating DRL training with human demonstrations.

Findings

1. Pre-training improves Atari game performance with fewer interactions.
2. The method outperforms state-of-the-art algorithms in Pong and MsPacman.
3. Pre-training is lightweight and easy to implement.

Abstract

Deep Reinforcement Learning (DRL) algorithms are known to be data inefficient. One reason is that a DRL agent learns both the features and the policy tabula rasa. Integrating prior knowledge into DRL algorithms is one way to improve learning efficiency, since it helps to build useful representations. In this work, we consider incorporating human knowledge to accelerate the asynchronous advantage actor-critic (A3C) algorithm by pre-training on a small amount of non-expert human demonstrations. We leverage the supervised autoencoder framework and propose a novel pre-training strategy that jointly trains a weighted supervised classification loss, an unsupervised reconstruction loss, and an expected return loss. The resulting pre-trained model learns more useful features than a model trained independently in a supervised or unsupervised fashion. Our pre-training method drastically improved the…
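
The joint objective described in the abstract can be illustrated with a minimal sketch for a single demonstration sample. This is not the authors' implementation: the function names, the per-action class weights, the use of mean squared error for reconstruction, the squared error for the return term, and the mixing coefficients `w_sup`, `w_ae`, and `w_val` are all assumptions made for illustration, since the abstract does not specify the exact form of each term.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy for one example, computed stably from raw logits."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def mean_squared_error(pred, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def joint_pretraining_loss(logits, action, class_weights,
                           recon, frame,
                           value_pred, observed_return,
                           w_sup=1.0, w_ae=1.0, w_val=1.0):
    """Hypothetical joint loss: weighted sum of three terms.

    All three terms are assumed forms, not taken from the paper:
      - supervised: class-weighted cross-entropy on the human's action
      - autoencoder: MSE between reconstructed and input frame
      - value: squared error between predicted value and observed return
    """
    supervised = class_weights[action] * softmax_cross_entropy(logits, action)
    autoencoder = mean_squared_error(recon, frame)
    value = (value_pred - observed_return) ** 2
    return w_sup * supervised + w_ae * autoencoder + w_val * value


# Usage: two actions, a two-pixel "frame", and one scalar return.
loss = joint_pretraining_loss(
    logits=[0.0, 0.0], action=0, class_weights=[1.0, 1.0],
    recon=[1.0, 0.0], frame=[0.0, 0.0],
    value_pred=1.0, observed_return=0.0,
)
```

In a real network the three terms would share one encoder, so minimizing their sum shapes a single set of features; the sketch only shows how the scalar losses might be combined.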

Code & Models

Repositories

gabrieledcjr/DeepRL
tf · Official

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification

Methods: Entropy Regularization · Dense Connections · Softmax · Convolution · A3C