Generalizable task representation learning from human demonstration videos: a geometric approach

Jun Jin; Martin Jagersand

arXiv:2202.13604·cs.RO·March 1, 2022

Open Access

TL;DR

This paper introduces a geometric approach for learning generalizable task representations from human demonstration videos, enabling robots to perform a task with different objects without additional training on the robot.

Contribution

The paper proposes CoVGS-IL, a method that uses a graph-structured task function to encode task geometry, allowing transfer to robot controllers without extra robot training or pre-recorded robot motions.

Findings

01 Enables task generalization across categorical objects.

02 Transfers learned representations to robot controllers via uncalibrated visual servoing.

03 Eliminates the need for extra robot training or pre-recorded motions.
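
Finding 02 refers to uncalibrated visual servoing. This summary does not describe the controller used in the paper, so the following is only a minimal sketch of one common textbook formulation: a control loop that estimates the visual-motor Jacobian online with a Broyden rank-one update instead of relying on camera or robot calibration. Everything here is an assumption made for illustration: the function names, the gain, and the toy linear error function, which stands in for whatever task-function output a learned representation would provide.

```python
import numpy as np


def broyden_update(J, dq, de, eps=1e-16):
    """Rank-one secant update: the returned estimate maps dq to de."""
    denom = float(dq @ dq)
    if denom < eps:
        # Step too small to carry information; keep the old estimate.
        return J
    return J + np.outer(de - J @ dq, dq) / denom


def uvs_servo(error_fn, q0, J0, gain=0.5, steps=100, tol=1e-6):
    """Drive error_fn(q) toward zero without a calibrated model.

    error_fn : hypothetical callable returning the task error for joint values q
    q0       : initial joint values
    J0       : initial guess of the Jacobian d(error)/d(q)
    """
    q = np.asarray(q0, dtype=float)
    J = np.asarray(J0, dtype=float)
    e = error_fn(q)
    for _ in range(steps):
        if np.linalg.norm(e) < tol:
            break
        # Damped Gauss-Newton style step using the current Jacobian estimate.
        dq = -gain * np.linalg.pinv(J) @ e
        q_new = q + dq
        e_new = error_fn(q_new)
        # Refine the Jacobian from the observed change in error.
        J = broyden_update(J, dq, e_new - e)
        q, e = q_new, e_new
    return q, e, J


# Toy stand-in for the unknown robot-camera system (assumed, not from the paper).
A_TRUE = np.array([[2.0, 0.5], [-0.3, 1.5]])
TARGET = np.array([1.0, -2.0])


def toy_error(q):
    return A_TRUE @ q - TARGET


q_final, e_final, J_est = uvs_servo(toy_error, q0=[0.0, 0.0], J0=np.eye(2))
print("final error norm:", np.linalg.norm(e_final))
```

The controller never sees `A_TRUE`; it only observes how the error changes after each motion, which is the property that makes the scheme "uncalibrated".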

Abstract

We study the problem of generalizable task learning from human demonstration videos without extra training on the robot or pre-recorded robot motions. Given a set of human demonstration videos showing a task with different objects/tools (categorical objects), we aim to learn a representation of visual observation that generalizes to categorical objects and enables efficient controller design. We propose to introduce a geometric task structure to the representation learning problem that geometrically encodes the task specification from human demonstration videos, and that enables generalization by building task specification correspondence between categorical objects. Specifically, we propose CoVGS-IL, which uses a graph-structured task function to learn task representations under structural constraints. Our method enables task generalization by selecting geometric features from…

Taxonomy

Topics: Multimodal Machine Learning Applications · Human Pose and Action Recognition · Domain Adaptation and Few-Shot Learning