Curious Representation Learning for Embodied Intelligence
Yilun Du, Chuang Gan, Phillip Isola

TL;DR
This paper introduces Curious Representation Learning (CRL), a framework in which an agent learns visual representations by exploring its environment. A reinforcement learning policy is rewarded for maximizing the error of the representation learner, which pushes it toward unfamiliar observations, and the resulting representations transfer to downstream navigation tasks.
Contribution
CRL jointly trains a reinforcement learning policy and a visual representation model, so the agent learns without supervision from data it gathers by exploring, and the learned representations transfer to downstream tasks and to real images.
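The joint training loop summarized above can be illustrated with a toy sketch. This is not the authors' implementation: the three-room environment, the lookup-table "representation model", the squared-error loss, and the REINFORCE-style bandit policy are all assumptions chosen to keep the example small and dependency-free. The only idea taken from the summary is that the policy's reward is the representation learner's current error, and that both are updated in the same loop.

```python
import math
import random

# Hypothetical toy environment: each "room" always shows the same 2-D observation.
ROOM_OBSERVATIONS = [(1.0, 0.0), (0.0, 2.0), (3.0, 3.0)]


def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def representation_loss(prediction, obs):
    """Squared error; a stand-in for whatever loss the representation model uses."""
    return sum((p - o) ** 2 for p, o in zip(prediction, obs))


def run_crl_toy(steps=2000, seed=0, model_lr=0.5, policy_lr=0.02, baseline_decay=0.9):
    rng = random.Random(seed)
    n = len(ROOM_OBSERVATIONS)
    predictions = [[0.0, 0.0] for _ in range(n)]  # stand-in representation model
    prefs = [0.0] * n                             # policy parameters (one per room)
    baseline = 0.0                                # moving average of past rewards
    visits = [0] * n

    def total_loss():
        return sum(
            representation_loss(predictions[i], ROOM_OBSERVATIONS[i]) for i in range(n)
        )

    initial_loss = total_loss()
    for _ in range(steps):
        # Policy picks which room to look at next.
        probs = softmax(prefs)
        room = rng.choices(range(n), weights=probs)[0]
        obs = ROOM_OBSERVATIONS[room]
        visits[room] += 1

        # Intrinsic reward: the representation model's error on the new observation.
        reward = representation_loss(predictions[room], obs)

        # Policy step (REINFORCE on a softmax policy, with a baseline).
        advantage = reward - baseline
        for a in range(n):
            grad = (1.0 if a == room else 0.0) - probs[a]
            prefs[a] += policy_lr * advantage * grad
        baseline = baseline_decay * baseline + (1.0 - baseline_decay) * reward

        # Representation step: move the prediction toward the observation,
        # which lowers the squared error on this room.
        predictions[room] = [
            p + model_lr * (o - p) for p, o in zip(predictions[room], obs)
        ]

    return {
        "visits": visits,
        "initial_loss": initial_loss,
        "final_loss": total_loss(),
    }


if __name__ == "__main__":
    print(run_crl_toy())
```

In this toy, a room stops being rewarding once the model has learned it, so the policy's incentive shifts toward rooms the model still gets wrong. A realistic version would replace the lookup table with a learned visual encoder and the bandit with a full reinforcement learning algorithm.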
Findings
CRL outperforms or matches ImageNet pretraining on navigation tasks.
Learned representations transfer effectively from simulation to real images.
The approach enables interpretable results on real-world data.
Abstract
Self-supervised representation learning has achieved remarkable success in recent years. By subverting the need for supervised labels, such approaches are able to utilize the numerous unlabeled images that exist on the Internet and in photographic datasets. Yet to build truly intelligent agents, we must construct representation learning algorithms that can learn not only from datasets but also learn from environments. An agent in a natural environment will not typically be fed curated data. Instead, it must explore its environment to acquire the data it will learn from. We propose a framework, curious representation learning (CRL), which jointly learns a reinforcement learning policy and a visual representation model. The policy is trained to maximize the error of the representation learner, and in doing so is incentivized to explore its environment. At the same time, the learned…
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques
