One-shot Visual Imitation via Attributed Waypoints and Demonstration Augmentation

Matthew Chang; Saurabh Gupta

arXiv:2302.04856 · cs.RO · February 10, 2023

Open Access

TL;DR

This paper introduces a modular approach to one-shot visual imitation that separates task inference from task execution and uses data augmentation to keep the model from fitting to task context, reaching 100% and 48% success rates on two benchmarks.

Contribution

The paper proposes a novel modular framework that separates task inference from execution and employs data augmentation to enhance one-shot visual imitation performance.

Findings

01

Achieved 100% success on one benchmark and 48% on another.

02

Improved on the prior state-of-the-art success rates by 90 and 20 percentage points (absolute), respectively.

03

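Identified three errors in existing methods (the DAgger problem from purely offline training, last centimeter errors in object interaction, and mis-fitting to the task context) and addressed them with hand-crafted motor primitives and data augmentation.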

Abstract

In this paper, we analyze the behavior of existing techniques and design new solutions for the problem of one-shot visual imitation. In this setting, an agent must solve a novel instance of a novel task given just a single visual demonstration. Our analysis reveals that current methods fall short because of three errors: the DAgger problem arising from purely offline training, last centimeter errors in interacting with objects, and mis-fitting to the task context rather than to the actual task. This motivates the design of our modular approach where we a) separate out task inference (what to do) from task execution (how to do it), and b) develop data augmentation and generation techniques to mitigate mis-fitting. The former allows us to leverage hand-crafted motor primitives for task execution which side-steps the DAgger problem and last centimeter errors, while the latter gets the…
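The split between task inference and task execution described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Waypoint` fields, the stubbed `infer_waypoints` function, and the straight-line `move_to` primitive are all hypothetical names and simplifications chosen for illustration. The point it shows is only structural: a learned component would decide what to do (a list of attributed waypoints), and a fixed, hand-written routine carries out how to do it.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Waypoint:
    """Hypothetical attributed waypoint: a target position plus one attribute."""
    position: Vec3
    gripper_closed: bool  # assumed attribute, for illustration only


def infer_waypoints(demo_frames, scene_image) -> List[Waypoint]:
    """Task inference ("what to do").

    Stand-in for a learned model; it ignores its inputs and returns a
    fixed plan so the sketch stays runnable.
    """
    return [
        Waypoint((0.4, 0.0, 0.1), True),
        Waypoint((0.1, 0.3, 0.2), False),
    ]


def move_to(current: Vec3, target: Vec3, step: float = 0.05) -> List[Vec3]:
    """Task execution ("how to do it").

    Hand-written primitive: straight-line motion in fixed-size steps,
    ending exactly at the target.
    """
    path: List[Vec3] = []
    pos = current
    while True:
        delta = tuple(t - p for t, p in zip(target, pos))
        dist = sum(d * d for d in delta) ** 0.5
        if dist <= step:
            path.append(target)
            return path
        pos = tuple(p + step * d / dist for p, d in zip(pos, delta))
        path.append(pos)


def execute(waypoints: List[Waypoint], start: Vec3):
    """Visit each waypoint in order and record the attribute applied there."""
    pos = start
    log = []
    for wp in waypoints:
        pos = move_to(pos, wp.position)[-1]
        log.append((pos, wp.gripper_closed))
    return log


plan = infer_waypoints(demo_frames=None, scene_image=None)
log = execute(plan, start=(0.0, 0.0, 0.0))
```

Because the motion routine is fixed rather than learned, only the waypoint prediction would need to be trained in such a design, which is one way to read the abstract's claim that hand-crafted primitives side-step the DAgger problem and last centimeter errors.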

Taxonomy

Topics: Multimodal Machine Learning Applications · Human Pose and Action Recognition · Robot Manipulation and Learning