Decoupling Representation Learning from Reinforcement Learning
Adam Stooke, Kimin Lee, Pieter Abbeel, and Michael Laskin

TL;DR
This paper introduces Augmented Temporal Contrast (ATC), an unsupervised learning task that decouples representation learning from policy learning in deep reinforcement learning from images. Encoders trained only with ATC match or outperform end-to-end RL in most environments.
Contribution
It proposes ATC, a contrastive learning method for image representations in RL, and evaluates it in three settings: online RL, pre-training on expert demonstrations with frozen encoder weights, and multi-task encoder training.
Findings
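Encoders trained exclusively with ATC match or outperform end-to-end RL encoders in most environments.
Among encoders pre-trained on expert demonstrations and then frozen, ATC-trained encoders yield the best-performing RL agents of the UL algorithms benchmarked.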
Multi-task encoders generalize across different environments.
Abstract
In an effort to overcome limitations of reward-driven feature learning in deep reinforcement learning (RL) from images, we propose decoupling representation learning from policy learning. To this end, we introduce a new unsupervised learning (UL) task, called Augmented Temporal Contrast (ATC), which trains a convolutional encoder to associate pairs of observations separated by a short time difference, under image augmentations and using a contrastive loss. In online RL experiments, we show that training the encoder exclusively using ATC matches or outperforms end-to-end RL in most environments. Additionally, we benchmark several leading UL algorithms by pre-training encoders on expert demonstrations and using them, with weights frozen, in RL agents; we find that agents using ATC-trained encoders outperform all others. We also train multi-task encoders on data from multiple environments…
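The core idea in the abstract, associating an observation with one a few steps later under image augmentation and a contrastive loss, can be illustrated with a small sketch. This is not the authors' implementation: the linear `encode` function is a hypothetical stand-in for the convolutional encoder, random shift is assumed as the augmentation, the InfoNCE-style loss is one common choice of contrastive loss, and all names and sizes are made up for illustration. It uses only the Python standard library, so it favors readability over speed.

```python
import math
import random

def random_shift(image, pad, rng):
    """Replicate-pad a 2D image by `pad` pixels, then crop back at a random offset."""
    h, w = len(image), len(image[0])
    dy, dx = rng.randint(0, 2 * pad), rng.randint(0, 2 * pad)

    def clamp(v, size):
        return max(0, min(size - 1, v))

    return [[image[clamp(y + dy - pad, h)][clamp(x + dx - pad, w)]
             for x in range(w)] for y in range(h)]

def encode(image, weights):
    """Hypothetical stand-in for a conv encoder: flatten, then apply a linear map."""
    flat = [p for row in image for p in row]
    return [sum(p * wgt for p, wgt in zip(flat, col)) for col in weights]

def info_nce_loss(anchors, positives):
    """Cross-entropy over dot-product scores; anchor i should match positive i."""
    total = 0.0
    for i, a in enumerate(anchors):
        logits = [sum(x * y for x, y in zip(a, p)) for p in positives]
        m = max(logits)  # subtract the max for numerical stability
        log_norm = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_norm - logits[i]
    return total / len(anchors)

def temporal_contrast_loss(trajectory, k, weights, rng, pad=2):
    """Pair each observation o_t with o_{t+k}, augment both, and contrast them."""
    anchors = [encode(random_shift(o, pad, rng), weights) for o in trajectory[:-k]]
    positives = [encode(random_shift(o, pad, rng), weights) for o in trajectory[k:]]
    return info_nce_loss(anchors, positives)

rng = random.Random(0)
# 16 toy grayscale frames of size 8x8 (random noise, purely illustrative).
trajectory = [[[rng.gauss(0, 1) for _ in range(8)] for _ in range(8)]
              for _ in range(16)]
# One weight column per output feature: 4 features over 64 pixels.
weights = [[rng.gauss(0, 0.1) for _ in range(64)] for _ in range(4)]
loss = temporal_contrast_loss(trajectory, k=3, weights=weights, rng=rng)
```

In a real training loop the loss would be minimized with respect to the encoder parameters by gradient descent. The sketch only computes the loss value, to show how temporal pairs and in-batch negatives are formed.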
Taxonomy
Topics: Reinforcement Learning in Robotics · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning
