You Only Demonstrate Once: Category-Level Manipulation from Single Visual Demonstration
Bowen Wen, Wenzhao Lian, Kostas Bekris, Stefan Schaal

TL;DR
This paper introduces a category-level manipulation framework that learns a task from a single demonstration video and transfers it to novel object instances, enabling robust, precise industrial manipulation without manual programming.
Contribution
The paper proposes an object-centric representation trained entirely in simulation, combined with model-free 6 DoF motion tracking, to learn complex manipulation tasks from a single demonstration.
Findings
Effective in high-precision industrial assembly tasks
Robust against dynamic uncertainties
Generalizes across object instances and scene configurations
Abstract
Promising results have been achieved recently in category-level manipulation that generalizes across object instances. Nevertheless, it often requires expensive real-world data collection and manual specification of semantic keypoints for each object category and task. Additionally, coarse keypoint predictions and ignoring intermediate action sequences hinder adoption in complex manipulation tasks beyond pick-and-place. This work proposes a novel, category-level manipulation framework that leverages an object-centric, category-level representation and model-free 6 DoF motion tracking. The canonical object representation is learned solely in simulation and then used to parse a category-level, task trajectory from a single demonstration video. The demonstration is reprojected to a target trajectory tailored to a novel object via the canonical representation. During execution, the…
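The reprojection step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes, purely for illustration, that the canonical representation reduces to a per-axis scaling between an object's bounding-box extent and a unit canonical box, and that the demonstrated trajectory is a list of 3D waypoints expressed in the demonstrated object's frame. The function name `retarget_waypoints` and all parameters are hypothetical.

```python
import numpy as np

def retarget_waypoints(demo_waypoints, demo_extent, novel_extent, novel_pose):
    """Map demo waypoints onto a novel object through a canonical space.

    Assumption (not from the paper): the canonical space is a unit box,
    reached by dividing object-frame coordinates by the object's extent.
    """
    demo_waypoints = np.asarray(demo_waypoints, dtype=float)
    # Demo object frame -> canonical (unit-box) coordinates.
    canonical = demo_waypoints / np.asarray(demo_extent, dtype=float)
    # Canonical coordinates -> novel object frame.
    local = canonical * np.asarray(novel_extent, dtype=float)
    # Novel object frame -> world frame via a 4x4 homogeneous pose.
    homogeneous = np.hstack([local, np.ones((len(local), 1))])
    return (novel_pose @ homogeneous.T).T[:, :3]

# Hypothetical demo: approach along z toward the origin of a 10 cm tall object.
demo = [[0.0, 0.0, 0.10], [0.0, 0.0, 0.05], [0.0, 0.0, 0.0]]
pose = np.eye(4)
pose[:3, 3] = [0.5, 0.2, 0.0]  # novel object placed elsewhere in the scene
target = retarget_waypoints(demo, [0.04, 0.04, 0.10], [0.06, 0.06, 0.20], pose)
print(target)
```

In this toy setup the approach height doubles because the novel object is twice as tall, which is the kind of instance-specific adjustment a category-level representation is meant to provide. A full system would also retarget orientations and track the object during execution.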
Taxonomy
Topics: Robot Manipulation and Learning · Reinforcement Learning in Robotics · Robotics and Sensor-Based Localization
