ImitationNet: Unsupervised Human-to-Robot Motion Retargeting via Shared Latent Space

Yashuai Yan; Esteve Valls Mascaro; Dongheui Lee

arXiv:2309.05310 · cs.RO · April 9, 2024

Open Access

TL;DR

ImitationNet is an unsupervised deep learning framework that creates a shared latent space for human and robot poses, enabling accurate motion retargeting without paired data and supporting diverse input modalities.

Contribution

The paper introduces a novel unsupervised method for human-to-robot motion retargeting that uses a shared latent space and adaptive contrastive learning, without requiring paired datasets.

Findings

01 Outperforms existing methods in efficiency and precision

02 Supports diverse input modalities like text, videos, and key poses

03 Successfully implemented on a real robot with collision avoidance

Abstract

This paper introduces a novel deep-learning approach for human-to-robot motion retargeting, enabling robots to mimic human poses accurately. Contrary to prior deep-learning-based works, our method does not require paired human-to-robot data, which facilitates its translation to new robots. First, we construct a shared latent space between humans and robots via adaptive contrastive learning that takes advantage of a proposed cross-domain similarity metric between the human and robot poses. Additionally, we propose a consistency term to build a common latent space that captures the similarity of the poses with precision while allowing direct robot motion control from the latent space. For instance, we can generate in-between motion through simple linear interpolation between two projected human poses. We conduct a comprehensive evaluation of robot control from diverse modalities (i.e.,…
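
The in-between motion generation mentioned in the abstract can be illustrated with a minimal sketch of linear interpolation between two latent codes. The encoder and decoder here (`encode_human`, `decode_robot`), the random weight matrices, and all dimensions are hypothetical stand-ins chosen only to make the example runnable; they are not the networks described in the paper.

```python
import numpy as np


def interpolate_latents(z_start, z_end, num_steps):
    """Return num_steps latent codes evenly spaced from z_start to z_end (inclusive)."""
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    z_start = np.asarray(z_start, dtype=float)
    z_end = np.asarray(z_end, dtype=float)
    # Column of blend weights in [0, 1]; broadcasting yields one row per step.
    alphas = np.linspace(0.0, 1.0, num_steps)[:, None]
    return (1.0 - alphas) * z_start + alphas * z_end


# Hypothetical stand-ins for a trained human encoder and robot decoder.
# Assumption: random linear maps replace the learned networks.
rng = np.random.default_rng(0)
HUMAN_DIM, LATENT_DIM, ROBOT_JOINTS = 12, 8, 6
W_enc = rng.standard_normal((HUMAN_DIM, LATENT_DIM))
W_dec = rng.standard_normal((LATENT_DIM, ROBOT_JOINTS))


def encode_human(pose):
    """Project a human pose vector into the (assumed) shared latent space."""
    return np.asarray(pose, dtype=float) @ W_enc


def decode_robot(latents):
    """Map latent codes to bounded robot joint values (assumed decoder)."""
    return np.tanh(latents @ W_dec)


# Two key poses, projected to latent space, interpolated, then decoded.
pose_a = rng.standard_normal(HUMAN_DIM)
pose_b = rng.standard_normal(HUMAN_DIM)
latent_path = interpolate_latents(encode_human(pose_a), encode_human(pose_b), num_steps=5)
joint_trajectory = decode_robot(latent_path)
print(joint_trajectory.shape)  # (5, 6)
```

The first and last rows of `latent_path` equal the two projected poses, so the decoded trajectory starts and ends at the robot poses corresponding to the two key poses. Whether intermediate codes decode to plausible robot poses depends on how the latent space was trained, which this sketch does not model.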

Taxonomy

Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Human Motion and Animation

Methods: Contrastive Learning