Learning Invariant Representations for Reinforcement Learning without Reconstruction
Amy Zhang, Rowan McAllister, Roberto Calandra, Yarin Gal, Sergey Levine

TL;DR
This paper introduces a method for learning invariant representations in reinforcement learning that focus on task-relevant information without relying on pixel reconstruction, improving robustness and generalization in complex visual environments.
Contribution
The paper proposes using bisimulation metrics to train encoders whose representations are invariant to task-irrelevant details, without pixel reconstruction, and reports state-of-the-art results on visual RL tasks with distractors.
Findings
Achieved SOTA performance on modified visual MuJoCo tasks with distractors.
Learned invariance to weather, clouds, and time of day in highway driving.
Demonstrated robustness and generalization through bisimulation-based representations.
Abstract
We study how representation learning can accelerate reinforcement learning from rich observations, such as images, without relying either on domain knowledge or pixel-reconstruction. Our goal is to learn representations that both provide for effective downstream control and invariance to task-irrelevant details. Bisimulation metrics quantify behavioral similarity between states in continuous MDPs, which we propose using to learn robust latent representations which encode only the task-relevant information from observations. Our method trains encoders such that distances in latent space equal bisimulation distances in state space. We demonstrate the effectiveness of our method at disregarding task-irrelevant information using modified visual MuJoCo tasks, where the background is replaced with moving distractors and natural videos, while achieving SOTA performance. We also test a…
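The idea of training an encoder so that latent distances match bisimulation distances can be sketched as a regression objective over pairs of states. The following is a minimal illustration, not the authors' implementation. It assumes details the summary above does not state: the bisimulation target for a pair is taken to be the absolute reward difference plus a discounted 2-Wasserstein distance between diagonal-Gaussian predictions of the next latent state, and the latent distance is taken to be the L1 norm. The function names, the `gamma` parameter, and the Gaussian dynamics model are all hypothetical choices made for this sketch.

```python
import math

def bisim_target(r_i, r_j, mu_i, sigma_i, mu_j, sigma_j, gamma=0.99):
    """Bisimulation-style target distance for one pair of states.

    Assumption: next-latent predictions are diagonal Gaussians (mean mu,
    standard deviation sigma), for which the 2-Wasserstein distance has
    the closed form sqrt(||mu_i - mu_j||^2 + ||sigma_i - sigma_j||^2).
    """
    mean_term = sum((a - b) ** 2 for a, b in zip(mu_i, mu_j))
    std_term = sum((a - b) ** 2 for a, b in zip(sigma_i, sigma_j))
    w2 = math.sqrt(mean_term + std_term)
    return abs(r_i - r_j) + gamma * w2

def latent_distance(z_i, z_j):
    """L1 distance between two latent vectors (an assumed choice of norm)."""
    return sum(abs(a - b) for a, b in zip(z_i, z_j))

def bisim_loss(z_i, z_j, target):
    """Squared error between the latent distance and the target distance."""
    return (latent_distance(z_i, z_j) - target) ** 2

# Two states with different rewards but identical predicted dynamics.
target = bisim_target(0.0, 1.0, [0.0, 0.0], [1.0, 1.0], [0.0, 0.0], [1.0, 1.0])
loss = bisim_loss([0.0, 0.0], [1.0, 1.0], target)
```

In a real training loop the latent vectors would come from the encoder and the loss would be minimized with an autodiff framework, treating the target as a constant; this sketch only evaluates the objective for a single pair.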
Taxonomy
Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning
