Learning Multi-Step Manipulation Tasks from A Single Human Demonstration

Dingkun Guo

arXiv:2312.15346 · cs.RO · January 5, 2024 · 1 citation

TL;DR

This paper introduces a system that learns multi-step manipulation tasks from a single human demonstration by translating human actions into robot primitives and identifying task-relevant key object poses. In a dishwashing experiment it reached 50-100% success per step and up to 40% for the whole task.

Contribution

The system processes RGBD videos to translate human demonstrations into robot primitives and accounts for human-robot differences in kinematics and collision geometry, enabling learning from a single demonstration in unstructured environments.

Findings

01

Achieved 50-100% success on each step of the dishwashing task.

02

Reached up to a 40% success rate for the entire multi-step task.

03

Learned from a single human demonstration recorded in a mockup kitchen.
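
The gap between per-step and whole-task success is what one would expect when several steps must all succeed in sequence. A minimal sketch of that arithmetic follows, assuming steps fail independently (an assumption made here for illustration, not a claim from the paper). The function name `whole_task_success` and the example rates are hypothetical; the rates are not the paper's measurements.

```python
from math import prod


def whole_task_success(step_rates):
    """Return the chance that every step succeeds.

    Assumes step outcomes are independent, so the whole-task
    rate is the product of the per-step rates.
    """
    for rate in step_rates:
        if not 0.0 <= rate <= 1.0:
            raise ValueError(f"success rate out of range: {rate}")
    return prod(step_rates)


if __name__ == "__main__":
    # Hypothetical per-step rates inside the reported 50-100% band.
    example_rates = [0.9, 0.8, 0.7, 0.6]
    print(f"whole-task success: {whole_task_success(example_rates):.2f}")
```

Under this assumption, even steps that each succeed more often than not can yield a much lower whole-task rate, which is consistent with the pattern in the findings listed here.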

Abstract

Learning from human demonstrations has exhibited remarkable achievements in robot manipulation. However, the challenge remains to develop a robot system that matches human capabilities and data efficiency in learning and generalizability, particularly in complex, unstructured real-world scenarios. We propose a system that processes RGBD videos to translate human actions to robot primitives and identifies task-relevant key poses of objects using Grounded Segment Anything. We then address challenges for robots in replicating human actions, considering the human-robot differences in kinematics and collision geometry. To test the effectiveness of our system, we conducted experiments focusing on manual dishwashing. With a single human demonstration recorded in a mockup kitchen, the system achieved 50-100% success for each step and up to a 40% success rate for the whole task with different…

Taxonomy

Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Robot Manipulation and Learning