K-VIL: Keypoints-based Visual Imitation Learning

Jianfeng Gao, Zhi Tao, Noémie Jaquier, and Tamim Asfour

arXiv:2209.03277 · cs.RO · July 26, 2023 · 1 citation

TL;DR

K-VIL introduces a method for robotic visual imitation that automatically extracts object-centric keypoints and geometric constraints from minimal demonstrations, enabling robust skill transfer in complex, real-world scenes.

Contribution

The paper presents a novel keypoint-based approach for visual imitation learning that works from a single demonstration and incrementally updates task representations.

Findings

01. Effective in cluttered scenes and under viewpoint mismatches.

02. Handles large object variations and new object instances.

03. Works in both one-shot and few-shot learning settings.

Abstract

Visual imitation learning provides efficient and intuitive solutions for robotic systems to acquire novel manipulation skills. However, simultaneously learning geometric task constraints and control policies from visual inputs alone remains a challenging problem. In this paper, we propose an approach for keypoint-based visual imitation (K-VIL) that automatically extracts sparse, object-centric, and embodiment-independent task representations from a small number of human demonstration videos. The task representation is composed of keypoint-based geometric constraints on principal manifolds, their associated local frames, and the movement primitives that are then needed for the task execution. Our approach is capable of extracting such task representations from a single demonstration video, and of incrementally updating them when new demonstrations become available. To reproduce…
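
The idea of a keypoint-based geometric constraint can be illustrated with a minimal sketch. It assumes that the positions of one keypoint at the end of each demonstration have already been expressed in a candidate local frame, and it uses plain linear PCA as a stand-in for the principal manifolds mentioned in the abstract. The function name `classify_constraint`, the `var_threshold` value, and the label names are hypothetical and are not taken from the paper or its repository.

```python
import numpy as np


def classify_constraint(points, var_threshold=1e-4):
    """Guess the constraint type of one keypoint from its spread across demos.

    points: (N, 3) array, the keypoint position at the end of each of N
            demonstrations, expressed in a candidate local frame (assumed
            to be given).
    var_threshold: hypothetical variance cutoff separating "constrained"
            from "free" directions; it would need tuning on real data.
    """
    points = np.asarray(points, dtype=float)
    centered = points - points.mean(axis=0)

    # Singular values of the centered data give the spread along each
    # principal direction (rows of vt).
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    variances = s**2 / max(len(points) - 1, 1)

    # Directions with non-negligible variance are treated as free.
    free_dims = int(np.sum(variances > var_threshold))
    labels = {0: "point", 1: "line", 2: "plane", 3: "unconstrained"}
    return labels[free_dims], variances, vt


if __name__ == "__main__":
    # Four demonstrations in which the keypoint ends somewhere along the x-axis.
    demos = np.array([[0.0, 0, 0], [1.0, 0, 0], [2.0, 0, 0], [3.0, 0, 0]])
    label, variances, _ = classify_constraint(demos)
    print(label, variances)
```

Under these assumptions, a keypoint whose end positions barely vary across demonstrations is read as constrained to a point, while one that varies along a single direction is read as constrained to a line. With a single demonstration the spread is zero by construction, so this sketch cannot tell the cases apart; it says nothing about how the paper handles the one-shot case.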

Code & Models

Repositories

https://gitlab.com/jianfenggaobit/kvil_public (official)

Taxonomy

Topics: Multimodal Machine Learning Applications · Human Pose and Action Recognition · Robot Manipulation and Learning