MoDem-V2: Visuo-Motor World Models for Real-World Robot Manipulation
Patrick Lancaster, Nicklas Hansen, Aravind Rajeswaran, Vikash Kumar

TL;DR
MoDem-V2 is a system that learns real-world robot manipulation directly from visual inputs, using demonstrations and model-based reinforcement learning to explore safely and acquire contact-rich skills without environment instrumentation.
Contribution
This work presents the first demonstration-augmented visual model-based reinforcement learning system trained directly in the real world for contact-rich manipulation.
Findings
Successfully learned complex manipulation tasks in real-world settings
Demonstration-bootstrapping improves exploration efficiency
Key ingredients enable safe and effective real-world learning
Abstract
Robotic systems that aspire to operate in uninstrumented real-world environments must perceive the world directly via onboard sensing. Vision-based learning systems aim to eliminate the need for environment instrumentation by building an implicit understanding of the world based on raw pixels, but navigating the contact-rich, high-dimensional search space using only sparse visual reward signals significantly exacerbates the challenge of exploration. The applicability of such systems is thus typically restricted to simulated or heavily engineered environments, since agent exploration in the real world without the guidance of explicit state estimation and dense rewards can lead to unsafe behavior and catastrophic safety faults. In this study, we isolate the root causes behind these limitations to develop a system, called MoDem-V2, capable of learning contact-rich manipulation…
Taxonomy
Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Adversarial Robustness in Machine Learning
