Visual Reinforcement Learning with Imagined Goals
Ashvin Nair, Vitchyr Pong, Murtaza Dalal, Shikhar Bahl, Steven Lin, Sergey Levine

TL;DR
This paper introduces a visual reinforcement learning method enabling agents to learn general-purpose, goal-conditioned skills from raw images through self-supervised practice, goal relabeling, and efficient off-policy algorithms, demonstrated on real robots.
Contribution
It combines unsupervised representation learning with goal-conditioned reinforcement learning, introducing a retroactive goal relabeling scheme for improved sample efficiency.
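To make the relabeling idea concrete, here is a minimal sketch under stated assumptions, not the authors' implementation. It assumes transitions are stored as dictionaries of latent vectors, that a relabeled goal is drawn either from a later state of the same trajectory or from a unit Gaussian prior over latents, and that the reward is the negative Euclidean distance to the goal in latent space. The names `relabel`, `sample_relabeled`, and `p_future` are hypothetical.

```python
import math
import random


def latent_distance(a, b):
    """Euclidean distance between two latent vectors (plain lists of floats)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def relabel(transition, new_goal):
    """Copy a stored transition, substitute the goal, and recompute the reward.

    Assumption: reward is the negative latent distance between the state
    reached and the (new) goal. The original transition is left untouched.
    """
    relabeled = dict(transition)
    relabeled["goal"] = list(new_goal)
    relabeled["reward"] = -latent_distance(transition["next_state"], new_goal)
    return relabeled


def sample_relabeled(trajectory, index, rng, p_future=0.5, latent_dim=2):
    """Relabel the transition at `index` with a retroactively chosen goal.

    With probability `p_future` (an assumed mixing ratio) the goal is a state
    reached at or after `index` in the same trajectory; otherwise it is drawn
    from a unit Gaussian, standing in for a learned prior over latent goals.
    """
    transition = trajectory[index]
    if rng.random() < p_future:
        future = rng.randrange(index, len(trajectory))
        new_goal = trajectory[future]["next_state"]
    else:
        new_goal = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
    return relabel(transition, new_goal)
```

Because the goal is replaced only at sampling time, a single stored transition can be reused for many different goals, which is where the sample-efficiency benefit of off-policy relabeling would come from.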
Findings
Outperforms prior techniques in real-world robotic experiments
Learns policies directly from raw image observations and goals
Efficient off-policy learning with visual inputs
Abstract
For an autonomous agent to fulfill a wide range of user-specified goals at test time, it must be able to learn broadly applicable and general-purpose skill repertoires. Furthermore, to provide the requisite level of generality, these skills must handle raw sensory input such as images. In this paper, we propose an algorithm that acquires such general-purpose skills by combining unsupervised representation learning and reinforcement learning of goal-conditioned policies. Since the particular goals that might be required at test-time are not known in advance, the agent performs a self-supervised "practice" phase where it imagines goals and attempts to achieve them. We learn a visual representation with three distinct purposes: sampling goals for self-supervised practice, providing a structured transformation of raw sensory inputs, and computing a reward signal for goal reaching. We also…
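The three purposes of the visual representation named in the abstract (sampling goals, transforming raw inputs, and computing a reward) can be illustrated with a small sketch. Everything here is an assumption made for illustration: `ToyRepresentation` is a hypothetical class, a fixed random linear projection stands in for the learned encoder, goals are drawn from a unit Gaussian, and the reward is taken to be the negative Euclidean distance in latent space.

```python
import math
import random


class ToyRepresentation:
    """Hypothetical stand-in for a learned visual representation.

    A fixed random linear map replaces the learned encoder so the example
    runs without any training or external libraries.
    """

    def __init__(self, image_size, latent_dim, seed=0):
        self._rng = random.Random(seed)
        self.latent_dim = latent_dim
        # One row of weights per latent dimension, scaled by image size.
        self.weights = [
            [self._rng.uniform(-1.0, 1.0) / image_size for _ in range(image_size)]
            for _ in range(latent_dim)
        ]

    def encode(self, image):
        """Purpose 2: map a flattened image (list of floats) to a latent vector."""
        return [sum(w * p for w, p in zip(row, image)) for row in self.weights]

    def sample_goal(self):
        """Purpose 1: 'imagine' a goal by sampling a latent vector (assumed unit Gaussian)."""
        return [self._rng.gauss(0.0, 1.0) for _ in range(self.latent_dim)]

    def reward(self, image, latent_goal):
        """Purpose 3: negative latent distance between the observation and the goal."""
        z = self.encode(image)
        return -math.sqrt(sum((a - b) ** 2 for a, b in zip(z, latent_goal)))
```

In a practice episode under these assumptions, the agent would call `sample_goal()` once, act while conditioning on that latent goal, and score each observed image with `reward(image, goal)`.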
Taxonomy
Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Domain Adaptation and Few-Shot Learning
