Transporters with Visual Foresight for Solving Unseen Rearrangement Tasks

Hongtao Wu; Jikai Ye; Xin Meng; Chris Paxton; Gregory Chirikjian

arXiv:2202.10765·cs.RO·July 28, 2022·1 cites

TL;DR

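This paper introduces Transporters with Visual Foresight (TVF), an image-based task planning method that combines a visual foresight model with a multi-modal action proposal module. TVF learns pick-and-place rearrangement from a handful of demonstrations and generalizes zero-shot to unseen tasks, raising the average success rate on unseen tasks from 55.4% to 78.5% in simulation.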

Contribution

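The paper presents a visual foresight model for pick-and-place rearrangement, combined with a multi-modal action proposal module built on the Goal-Conditioned Transporter Network. Together they enable zero-shot generalization to unseen rearrangement tasks from only tens of demonstrations.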

Findings

01 Success rate on unseen tasks improved from 55.4% to 78.5% in simulation.

02 Success rate on real robots increased from 30% to 63.3%.

03 Model learns effectively from only tens of demonstrations.

Abstract

Rearrangement tasks have been identified as a crucial challenge for intelligent robotic manipulation, but few methods allow for precise construction of unseen structures. We propose a visual foresight model for pick-and-place rearrangement manipulation which is able to learn efficiently. In addition, we develop a multi-modal action proposal module which builds on the Goal-Conditioned Transporter Network, a state-of-the-art imitation learning method. Our image-based task planning method, Transporters with Visual Foresight, is able to learn from only a handful of data and generalize to multiple unseen tasks in a zero-shot manner. TVF is able to improve the performance of a state-of-the-art imitation learning method on unseen tasks in simulation and real robot experiments. In particular, the average success rate on unseen tasks improves from 55.4% to 78.5% in simulation experiments and…

Taxonomy

Topics: Robot Manipulation and Learning · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning