Data-Efficient Reinforcement Learning with Self-Predictive Representations

Max Schwarzer; Ankesh Anand; Rishab Goel; R Devon Hjelm; Aaron Courville; Philip Bachman

arXiv:2007.05929 · cs.LG · May 21, 2021 · 50 cites

TL;DR

This paper introduces Self-Predictive Representations (SPR), a self-supervised method that predicts future latent states to improve sample efficiency in deep reinforcement learning from pixel inputs, especially in limited data scenarios.

Contribution

SPR is a novel approach that combines future state prediction with data augmentation to enhance learning efficiency in pixel-based RL tasks, outperforming prior methods.

Findings

01. Achieves a median human-normalized score of 0.415 on Atari with 100k steps.

02. Outperforms previous state-of-the-art in sample-efficient deep RL from pixels.

03. Exceeds expert human scores on 7 out of 26 Atari games in limited data regime.
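
As a rough illustration of how the figures above are usually computed, the sketch below applies the standard human-normalized score, (agent - random) / (human - random), per game and then takes the median across games. The game names and raw scores are hypothetical placeholders, not results from the paper, and the helper name `human_normalized` is introduced here for illustration only.

```python
from statistics import median


def human_normalized(agent, random, human):
    """Standard human-normalized score: 0.0 matches random play, 1.0 matches the human reference."""
    return (agent - random) / (human - random)


# Hypothetical raw scores as (agent, random, human); NOT numbers from the paper.
scores = {
    "game_a": (30.0, 10.0, 110.0),
    "game_b": (500.0, 100.0, 900.0),
    "game_c": (5.0, 0.0, 4.0),
}

# Normalize each game, then aggregate across games.
normalized = {game: human_normalized(*s) for game, s in scores.items()}
median_hns = median(normalized.values())

# A game counts as exceeding the human reference when its normalized score is above 1.0.
num_superhuman = sum(v > 1.0 for v in normalized.values())

print(f"median human-normalized score: {median_hns:.3f}")
print(f"games above human reference: {num_superhuman} of {len(normalized)}")
```

The median is used rather than the mean so that a single game with a very large normalized score does not dominate the aggregate.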

Abstract

While deep reinforcement learning excels at solving tasks where large amounts of data can be collected through virtually unlimited interaction with the environment, learning from limited interaction remains a key challenge. We posit that an agent can learn more efficiently if we augment reward maximization with self-supervised objectives based on structure in its visual input and sequential interaction with the environment. Our method, Self-Predictive Representations (SPR), trains an agent to predict its own latent state representations multiple steps into the future. We compute target representations for future states using an encoder which is an exponential moving average of the agent's parameters, and we make predictions using a learned transition model. On its own, this future prediction objective outperforms prior methods for sample-efficient deep RL from pixels. We further improve…
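
The two mechanisms named in the abstract, an exponential-moving-average (EMA) target encoder and multi-step latent prediction with a transition model, can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: parameters and latents are plain lists of floats, the function names and the `tau` coefficient are hypothetical, and negative cosine similarity is assumed as the prediction loss because the abstract does not specify one.

```python
import math


def ema_update(target, online, tau):
    """EMA step: target <- tau * target + (1 - tau) * online (tau is an assumed coefficient)."""
    return [tau * t + (1.0 - tau) * o for t, o in zip(target, online)]


def cosine_similarity(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rollout(z0, actions, transition):
    """Apply a transition model step by step to predict one latent per future step."""
    preds, z = [], z0
    for action in actions:
        z = transition(z, action)
        preds.append(z)
    return preds


def prediction_loss(preds, targets):
    """Assumed loss: negative cosine similarity summed over the predicted steps."""
    return -sum(cosine_similarity(p, t) for p, t in zip(preds, targets))


def toy_transition(z, action):
    """Hypothetical stand-in for a learned transition model."""
    return [x + 0.5 * action for x in z]


# Predict two steps ahead from an initial latent, then score against target latents.
preds = rollout([1.0, 0.0], actions=[1, 1], transition=toy_transition)
targets = [[1.5, 0.5], [2.0, 1.0]]  # stand-ins for target-encoder outputs
loss = prediction_loss(preds, targets)

# After each training step, the target encoder would track the online encoder.
target_params = ema_update([0.0, 0.0], [1.0, 1.0], tau=0.9)
```

In a real agent the targets would come from encoding the actually observed future frames with the EMA encoder, and no gradient would flow through them; here they are fixed lists so the arithmetic is easy to follow.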

Code & Models

Repositories

mila-iqia/spr (PyTorch, official)

Videos

Data-Efficient Reinforcement Learning with Self-Predictive Representations · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications