Data-Efficient Reinforcement Learning with Self-Predictive Representations
Max Schwarzer, Ankesh Anand, Rishab Goel, R Devon Hjelm, Aaron Courville, Philip Bachman

TL;DR
This paper introduces Self-Predictive Representations (SPR), a self-supervised method that predicts future latent states to improve sample efficiency in deep reinforcement learning from pixel inputs, especially in limited data scenarios.
Contribution
SPR combines multi-step prediction of future latent states with data augmentation to improve sample efficiency in pixel-based RL, outperforming prior methods in the 100k-step Atari setting.
Findings
Achieves a median human-normalized score of 0.415 on Atari with 100k steps.
Outperforms previous state-of-the-art in sample-efficient deep RL from pixels.
Exceeds expert human scores on 7 out of 26 Atari games in the limited-data regime.
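For readers unfamiliar with the metric, the sketch below shows how a median human-normalized score is commonly computed on Atari: each game's raw score is rescaled so that a random policy maps to 0 and the human reference maps to 1, and the median is taken across games. The game names and all numbers here are hypothetical placeholders, not results from the paper.

```python
from statistics import median


def human_normalized(agent, random, human):
    """Rescale a raw score so that random play = 0.0 and human play = 1.0."""
    return (agent - random) / (human - random)


# Hypothetical (agent, random, human) raw scores; NOT values from the paper.
scores = {
    "game_a": (50.0, 10.0, 110.0),
    "game_b": (300.0, 100.0, 200.0),
    "game_c": (5.0, 0.0, 50.0),
}

normalized = {game: human_normalized(*raw) for game, raw in scores.items()}

# Median across games is the headline number reported in this kind of benchmark.
median_hns = median(normalized.values())

# A game counts as "exceeding human" when its normalized score is above 1.0.
superhuman = [game for game, s in normalized.items() if s > 1.0]
```

Under this convention, a median of 0.415 means that on the median game the agent closes a little over 40% of the gap between random and human play.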
Abstract
While deep reinforcement learning excels at solving tasks where large amounts of data can be collected through virtually unlimited interaction with the environment, learning from limited interaction remains a key challenge. We posit that an agent can learn more efficiently if we augment reward maximization with self-supervised objectives based on structure in its visual input and sequential interaction with the environment. Our method, Self-Predictive Representations (SPR), trains an agent to predict its own latent state representations multiple steps into the future. We compute target representations for future states using an encoder which is an exponential moving average of the agent's parameters, and we make predictions using a learned transition model. On its own, this future prediction objective outperforms prior methods for sample-efficient deep RL from pixels. We further improve…
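The two mechanisms named in the abstract, an exponential-moving-average target encoder and multi-step latent prediction through a transition model, can be illustrated with a minimal pure-Python sketch. Everything here is an assumption for illustration: the function names, the toy transition function, the value of `tau`, and the use of negative cosine similarity as the prediction loss are not taken from the text above, and real implementations operate on neural network parameters rather than short lists.

```python
import math


def ema_update(target_params, online_params, tau):
    """Move target parameters toward the online ones: tau*target + (1-tau)*online."""
    return [tau * t + (1.0 - tau) * o for t, o in zip(target_params, online_params)]


def cosine_loss(pred, target):
    """Negative cosine similarity (assumed loss): -1.0 when directions match."""
    dot = sum(p * t for p, t in zip(pred, target))
    norm = math.sqrt(sum(p * p for p in pred)) * math.sqrt(sum(t * t for t in target))
    return -dot / norm


def multi_step_prediction_loss(z0, actions, transition, target_latents):
    """Roll the transition model forward from z0 and sum the per-step losses."""
    z, total = z0, 0.0
    for action, z_target in zip(actions, target_latents):
        z = transition(z, action)          # predicted latent one step further
        total += cosine_loss(z, z_target)  # compare with the target-encoder latent
    return total


# Toy stand-in for a learned transition model (hypothetical).
def toy_transition(z, action):
    return [x + action for x in z]


new_target = ema_update([1.0, 0.0], [0.0, 1.0], tau=0.9)

loss = multi_step_prediction_loss(
    z0=[1.0, 2.0],
    actions=[1.0, 1.0],
    transition=toy_transition,
    target_latents=[[2.0, 3.0], [3.0, 4.0]],  # would come from the EMA encoder
)
```

In this toy case the predictions match the targets exactly, so each of the two steps contributes a loss of about -1. In training, gradients would flow only through the online encoder and transition model, while the target encoder is updated by `ema_update` alone.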
Taxonomy
Topics: Reinforcement Learning in Robotics · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications
