Flow as the Cross-Domain Manipulation Interface

Mengda Xu; Zhenjia Xu; Yinghao Xu; Cheng Chi; Gordon Wetzstein; Manuela Veloso; Shuran Song

arXiv:2407.15208·cs.RO·October 7, 2024

Open Access·3 Reviews

TL;DR

Im2Flow2Act introduces a scalable framework that uses object flow as an interface to transfer manipulation skills from human demonstrations and simulation to real robots, reducing the need for real-world training data.

Contribution

The paper presents a novel approach combining human demonstration videos and simulated robot data to enable real-world robot manipulation without extensive real-world training.

Findings

01. Successfully manipulates rigid, articulated, and deformable objects in real-world settings.

02. Reduces sim-to-real gap by using object flow as the manipulation interface.

03. Scalable system that bypasses the need for robot teleoperation.

Abstract

We present Im2Flow2Act, a scalable learning framework that enables robots to acquire real-world manipulation skills without the need of real-world robot training data. The key idea behind Im2Flow2Act is to use object flow as the manipulation interface, bridging domain gaps between different embodiments (i.e., human and robot) and training environments (i.e., real-world and simulated). Im2Flow2Act comprises two components: a flow generation network and a flow-conditioned policy. The flow generation network, trained on human demonstration videos, generates object flow from the initial scene image, conditioned on the task description. The flow-conditioned policy, trained on simulated robot play data, maps the generated object flow to robot actions to realize the desired object movements. By using flow as input, this policy can be directly deployed in the real world with a minimal…
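The two-component structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: both components are learned networks in the paper, whereas the functions below (`toy_flow_generator`, `toy_flow_policy`, `run_pipeline`) are hypothetical hand-written stand-ins, and the representation of flow as per-step 2D keypoint positions is an assumption made for illustration. The sketch only shows how object flow can sit between a task-conditioned generator and an action-producing policy.

```python
from typing import List, Tuple

Point = Tuple[float, float]   # assumed: (x, y) location of one tracked object keypoint
Flow = List[List[Point]]      # assumed: flow[t][k] is keypoint k at time step t


def toy_flow_generator(initial_points: List[Point], task: str, steps: int = 4) -> Flow:
    """Hypothetical stand-in for the flow generation network.

    Shifts every keypoint by a fixed per-step offset chosen from the task text.
    """
    dx = 1.0 if "right" in task else -1.0
    return [[(x + dx * t, y) for (x, y) in initial_points] for t in range(steps)]


def toy_flow_policy(flow: Flow, t: int) -> Point:
    """Hypothetical stand-in for the flow-conditioned policy.

    Returns the mean keypoint displacement between steps t and t + 1 as a 2D action.
    """
    cur, nxt = flow[t], flow[t + 1]
    n = len(cur)
    mean_dx = sum(b[0] - a[0] for a, b in zip(cur, nxt)) / n
    mean_dy = sum(b[1] - a[1] for a, b in zip(cur, nxt)) / n
    return (mean_dx, mean_dy)


def run_pipeline(initial_points: List[Point], task: str, steps: int = 4) -> List[Point]:
    """Generate object flow first, then map it to one action per transition."""
    flow = toy_flow_generator(initial_points, task, steps)
    return [toy_flow_policy(flow, t) for t in range(steps - 1)]


actions = run_pipeline([(0.0, 0.0), (2.0, 1.0)], "push the block right")
```

Because the policy in this sketch sees only the flow, the generator could be replaced by any other source of object flow without changing the policy, which is the interface property the abstract emphasizes.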

Peer Reviews

Decision·CoRL 2024

Reviewer 01·Rating 3·Confidence 3

Reviewer 02·Rating 2·Confidence 3

Reviewer 03·Rating 3·Confidence 5

Taxonomy

Topics·Reinforcement Learning in Robotics