Natural Language Can Help Bridge the Sim2Real Gap
Albert Yu, Adeline Foote, Raymond Mooney, Roberto Martín-Martín

TL;DR
This paper uses natural language descriptions of images to bridge the visual gap between simulated and real-world observations, so that image-conditioned robot policies transfer from simulation with less real-world data.
Contribution
The paper proposes leveraging language descriptions as a unifying signal to create domain-invariant visual representations for sim2real transfer, outperforming existing methods.
Findings
Outperforms prior sim2real methods by 25 to 40%.
Using language-based pretraining improves domain invariance.
Effective with limited real-world demonstrations.
Abstract
The main challenge in learning image-conditioned robotic policies is acquiring a visual representation conducive to low-level control. Due to the high dimensionality of the image space, learning a good visual representation requires a considerable amount of visual data. However, when learning in the real world, data is expensive. Sim2Real is a promising paradigm for overcoming data scarcity in the real-world target domain by using a simulator to collect large amounts of cheap data closely related to the target task. However, it is difficult to transfer an image-conditioned policy from sim to real when the domains are very visually dissimilar. To bridge the sim2real visual gap, we propose using natural language descriptions of images as a unifying signal across domains that captures the underlying task-relevant semantics. Our key insight is that if two image observations from different…
Taxonomy
Topics: Scientific Computing and Data Management · Genetics, Bioinformatics, and Biomedical Research
Methods: Contrastive Language-Image Pre-training
