Flow as the Cross-Domain Manipulation Interface
Mengda Xu, Zhenjia Xu, Yinghao Xu, Cheng Chi, Gordon Wetzstein, Manuela Veloso, Shuran Song

TL;DR
Im2Flow2Act introduces a scalable framework that uses object flow as an interface to transfer manipulation skills from human demonstrations and simulation to real robots, reducing the need for real-world training data.
Contribution
The paper presents a novel approach combining human demonstration videos and simulated robot data to enable real-world robot manipulation without extensive real-world training.
Findings
Successfully manipulates rigid, articulated, and deformable objects in real-world settings.
Reduces the sim-to-real gap by using object flow as the manipulation interface.
Scalable system that bypasses the need for robot teleoperation.
Abstract
We present Im2Flow2Act, a scalable learning framework that enables robots to acquire real-world manipulation skills without the need for real-world robot training data. The key idea behind Im2Flow2Act is to use object flow as the manipulation interface, bridging domain gaps between different embodiments (i.e., human and robot) and training environments (i.e., real-world and simulated). Im2Flow2Act comprises two components: a flow generation network and a flow-conditioned policy. The flow generation network, trained on human demonstration videos, generates object flow from the initial scene image, conditioned on the task description. The flow-conditioned policy, trained on simulated robot play data, maps the generated object flow to robot actions to realize the desired object movements. By using flow as input, this policy can be directly deployed in the real world with a minimal…
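The two-component structure described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`generate_flow`, `flow_conditioned_policy`, `rollout`), the task strings, and the 2D keypoint representation are all hypothetical, and both learned networks are replaced by hand-written placeholders. The sketch only shows how object flow can sit between a generator that never sees a robot and a policy that never sees the task description.

```python
from typing import List, Tuple

Point = Tuple[float, float]
# flow[t][k] is the 2D position of object keypoint k at time step t.
Flow = List[List[Point]]


def generate_flow(initial_keypoints: List[Point], task: str, horizon: int) -> Flow:
    """Hypothetical stand-in for a flow generation network.

    A real model would be conditioned on the initial image and task text;
    here each task simply maps to a fixed per-step displacement.
    """
    directions = {"push right": (1.0, 0.0), "lift": (0.0, -1.0)}  # assumed tasks
    dx, dy = directions[task]
    return [
        [(x + dx * t, y + dy * t) for (x, y) in initial_keypoints]
        for t in range(horizon + 1)
    ]


def flow_conditioned_policy(current: List[Point], flow: Flow, step: int) -> Point:
    """Hypothetical stand-in for a flow-conditioned policy.

    The action is the mean displacement from the current keypoints to the
    next keypoints in the target flow. It depends only on flow, not on the task.
    """
    target = flow[min(step + 1, len(flow) - 1)]
    n = len(current)
    ax = sum(tx - cx for (tx, _), (cx, _) in zip(target, current)) / n
    ay = sum(ty - cy for (_, ty), (_, cy) in zip(target, current)) / n
    return (ax, ay)


def rollout(initial_keypoints: List[Point], task: str, horizon: int):
    """Generate a flow once, then follow it step by step in a toy environment."""
    flow = generate_flow(initial_keypoints, task, horizon)
    keypoints = list(initial_keypoints)
    for step in range(horizon):
        ax, ay = flow_conditioned_policy(keypoints, flow, step)
        # Toy dynamics: the object translates rigidly by the commanded action.
        keypoints = [(x + ax, y + ay) for (x, y) in keypoints]
    return keypoints, flow


final_keypoints, target_flow = rollout([(0.0, 0.0), (1.0, 1.0)], "push right", 3)
```

In this toy setting the final keypoints coincide with the last frame of the generated flow, which is the sense in which the policy "realizes" the desired object movement. Either placeholder could be swapped for a learned model without changing the other, as long as both agree on the flow representation.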
Peer Reviews
Decision·CoRL 2024
Taxonomy
Topics: Reinforcement Learning in Robotics
