ImitationNet: Unsupervised Human-to-Robot Motion Retargeting via Shared Latent Space
Yashuai Yan, Esteve Valls Mascaro, Dongheui Lee

TL;DR
ImitationNet is an unsupervised deep learning framework that creates a shared latent space for human and robot poses, enabling accurate motion retargeting without paired data and supporting diverse input modalities.
Contribution
The paper introduces a novel unsupervised method for human-to-robot motion retargeting using shared latent space and adaptive contrastive learning, without requiring paired datasets.
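As a rough illustration of how contrastive learning can shape a shared latent space, the sketch below uses a generic triplet-margin loss. It is not the paper's exact objective: the function names, the fixed margin, the 2-D latent codes, and the similarity scores are all hypothetical stand-ins, and the paper's cross-domain similarity metric is represented here only by two made-up numbers.

```python
import numpy as np

def order_by_similarity(sim_a, sim_b, z_a, z_b):
    """Return (positive, negative): the candidate more similar to the anchor is the positive."""
    return (z_a, z_b) if sim_a >= sim_b else (z_b, z_a)

def triplet_margin_loss(z_anchor, z_pos, z_neg, margin=0.2):
    """Generic triplet loss: pull the positive closer to the anchor than the negative."""
    d_pos = np.linalg.norm(z_anchor - z_pos)
    d_neg = np.linalg.norm(z_anchor - z_neg)
    return max(0.0, d_pos - d_neg + margin)

# Hypothetical latent codes: one encoded human pose and two encoded robot poses.
z_human = np.array([0.0, 0.0])
z_robot_1 = np.array([0.1, 0.0])
z_robot_2 = np.array([1.0, 0.0])

# Assumed similarity scores (placeholders for a cross-domain pose similarity).
z_pos, z_neg = order_by_similarity(0.9, 0.2, z_robot_1, z_robot_2)

loss_good = triplet_margin_loss(z_human, z_pos, z_neg)  # latent space agrees with the ranking
loss_bad = triplet_margin_loss(z_human, z_neg, z_pos)   # roles swapped: ranking is violated
```

The loss vanishes when the latent distances already respect the similarity ranking and grows when they contradict it, which is the pressure that aligns the two domains during training.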
Findings
Outperforms existing methods in efficiency and precision
Supports diverse input modalities like text, videos, and key poses
Successfully implemented on a real robot with collision avoidance
Abstract
This paper introduces a novel deep-learning approach for human-to-robot motion retargeting, enabling robots to mimic human poses accurately. Contrary to prior deep-learning-based works, our method does not require paired human-to-robot data, which facilitates its translation to new robots. First, we construct a shared latent space between humans and robots via adaptive contrastive learning that takes advantage of a proposed cross-domain similarity metric between the human and robot poses. Additionally, we propose a consistency term to build a common latent space that captures the similarity of the poses with precision while allowing direct robot motion control from the latent space. For instance, we can generate in-between motion through simple linear interpolation between two projected human poses. We conduct a comprehensive evaluation of robot control from diverse modalities (i.e.,…
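The in-between motion generation mentioned in the abstract can be sketched as plain linear interpolation between two latent codes. This is a minimal sketch under stated assumptions: `interpolate_latents` is a hypothetical helper, the 4-D vectors stand in for the latent codes of two projected human poses, and the encoder and robot decoder are not modeled here.

```python
import numpy as np

def interpolate_latents(z_start, z_end, num_steps):
    """Return num_steps latent codes evenly spaced from z_start to z_end (both included)."""
    z_start = np.asarray(z_start, dtype=float)
    z_end = np.asarray(z_end, dtype=float)
    # Interpolation weights in [0, 1], shaped to broadcast over the latent dimension.
    alphas = np.linspace(0.0, 1.0, num_steps)[:, None]
    return (1.0 - alphas) * z_start + alphas * z_end

# Hypothetical latent codes standing in for two encoded human key poses.
z_a = np.array([0.0, 1.0, -1.0, 2.0])
z_b = np.array([1.0, 0.0, 1.0, 0.0])

# Each row is one intermediate latent code; a decoder would map it to a robot pose.
path = interpolate_latents(z_a, z_b, num_steps=5)
```

In a full pipeline, each row of `path` would be passed through the robot decoder to obtain the intermediate robot poses.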
Taxonomy
Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Human Motion and Animation
Methods: Contrastive Learning
