One-Shot Imitation under Mismatched Execution

Kushal Kedia; Prithwish Dan; Angela Chao; Maximus Adrian Pace; Sanjiban Choudhury

arXiv:2409.06615·cs.RO·April 1, 2025

Open Access

TL;DR

RHyME is a framework that lets robots imitate human demonstrations. It automatically pairs robot trajectories with human videos, synthesizing a matching human video for each robot trajectory from short retrieved clips, so it needs neither paired human-robot data nor frame-level visual similarity.

Contribution

The paper introduces RHyME, a sequence-level optimal transport-based method for cross-embodiment imitation that does not require paired data or frame-level visual similarity.

Findings

01. Achieves an increase of over 50% in task success rate.

02. Successfully imitates cross-embodiment demonstrations both in simulation and in the real world.

03. Enables policy training without paired human-robot data.

Abstract

Human demonstrations as prompts are a powerful way to program robots to do long-horizon manipulation tasks. However, translating these demonstrations into robot-executable actions presents significant challenges due to execution mismatches in movement styles and physical capabilities. Existing methods for human-robot translation either depend on paired data, which is infeasible to scale, or rely heavily on frame-level visual similarities that often break down in practice. To address these challenges, we propose RHyME, a novel framework that automatically pairs human and robot trajectories using sequence-level optimal transport cost functions. Given long-horizon robot demonstrations, RHyME synthesizes semantically equivalent human videos by retrieving and composing short-horizon human clips. This approach facilitates effective policy training without the need for paired data. RHyME…
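
The pairing idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes each video has already been encoded as a sequence of per-frame embedding vectors, uses a cosine cost with uniform frame weights, and approximates the optimal transport cost with a basic Sinkhorn iteration. The function names (`cosine_cost`, `sinkhorn_cost`, `retrieve_clip`) and the parameters `reg` and `n_iters` are hypothetical, chosen only for illustration.

```python
import numpy as np

def cosine_cost(a, b):
    """Pairwise cosine distance between two embedding sequences (rows = frames)."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return 1.0 - a @ b.T

def sinkhorn_cost(cost, reg=0.1, n_iters=200):
    """Entropy-regularized OT cost between two sequences with uniform frame weights.

    Assumption: uniform marginals and a fixed regularizer; the paper's exact
    cost function may differ.
    """
    n, m = cost.shape
    mu = np.full(n, 1.0 / n)   # mass on each frame of the first sequence
    nu = np.full(m, 1.0 / m)   # mass on each frame of the second sequence
    K = np.exp(-cost / reg)    # Gibbs kernel
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = mu / (K @ v)
        v = nu / (K.T @ u)
    plan = u[:, None] * K * v[None, :]   # soft frame-to-frame matching
    return float((plan * cost).sum())

def retrieve_clip(robot_segment, human_clips):
    """Return the index of the human clip with the lowest sequence-level cost."""
    costs = [sinkhorn_cost(cosine_cost(robot_segment, clip)) for clip in human_clips]
    return int(np.argmin(costs)), costs
```

Because the cost is computed over whole sequences rather than frame by frame, a human clip that covers the same content at a different speed (and therefore with a different number of frames) can still score well against a robot segment. Retrieving one clip per robot segment and concatenating the results would be one way to compose a long-horizon human video, again under the assumptions stated above.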

Taxonomy

Topics: Software Testing and Debugging Techniques