OmniHumanoid: Streaming Cross-Embodiment Video Generation with Paired-Free Adaptation

Yiren Song; Xiyao Deng; Pei Yang; Yihan Wang; Mike Zheng Shou

arXiv:2605.12038·cs.CV·May 13, 2026

TL;DR

OmniHumanoid is a novel framework for cross-embodiment video generation that separates motion transfer from embodiment adaptation, enabling scalable, high-fidelity motion synthesis across diverse humanoid robots using unpaired data.

Contribution

It introduces a factorized approach with shared motion models and lightweight embodiment-specific adapters, along with a new synthetic dataset for cross-embodiment learning.

Findings

1. Achieves high motion fidelity and embodiment consistency.

2. Enables adaptation to new embodiments without retraining the shared model.

3. Performs well on synthetic and real-world benchmarks.

Abstract

Cross-embodiment video generation aims to transfer motions across different humanoid embodiments, such as human-to-robot and robot-to-robot, enabling scalable data generation for embodied intelligence. A major challenge in this setting is that motion dynamics are partly transferable across embodiments, whereas appearance and morphology remain embodiment-specific. Existing approaches often entangle these factors, and many require paired data for every target embodiment, which limits scalability to new robots. We present OmniHumanoid, a framework that factorizes transferable motion learning and embodiment-specific adaptation. Our method learns a shared motion transfer model from motion-aligned paired videos spanning multiple embodiments, while adapting to a new embodiment using only unpaired videos through lightweight embodiment-specific adapters. To reduce interference between motion…
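
The factorization described above (one shared motion model, plus a lightweight adapter per embodiment) can be illustrated with a toy sketch. This is not the authors' implementation: the class names, the single linear layer standing in for the shared model, and the low-rank residual form of the adapter are all assumptions made for illustration. The abstract only says the adapters are "lightweight" and embodiment-specific.

```python
import random


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


class SharedMotionLayer:
    """Hypothetical stand-in for the shared motion transfer model.

    In this sketch it is a single linear map that stays fixed once built.
    """

    def __init__(self, dim, seed=0):
        rng = random.Random(seed)
        self.weight = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]

    def __call__(self, x):
        return matvec(self.weight, x)


class EmbodimentAdapter:
    """Hypothetical low-rank residual adapter (an assumption, not from the paper)."""

    def __init__(self, dim, rank, seed):
        rng = random.Random(seed)
        self.down = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(rank)]
        # Zero-initialised up-projection: a new adapter starts as a no-op.
        self.up = [[0.0] * rank for _ in range(dim)]

    def __call__(self, x):
        return matvec(self.up, matvec(self.down, x))

    def num_params(self):
        return len(self.down) * len(self.down[0]) + len(self.up) * len(self.up[0])


class FactorizedModel:
    """Shared layer plus a dictionary of per-embodiment adapters."""

    def __init__(self, dim, rank):
        self.dim, self.rank = dim, rank
        self.shared = SharedMotionLayer(dim)
        self.adapters = {}

    def add_embodiment(self, name, seed=1):
        # Registering a new embodiment only creates adapter parameters;
        # the shared weights are never touched.
        self.adapters[name] = EmbodimentAdapter(self.dim, self.rank, seed)

    def forward(self, x, embodiment):
        base = self.shared(x)
        delta = self.adapters[embodiment](x)
        return [b + d for b, d in zip(base, delta)]
```

In this toy setup, calling `add_embodiment("robot_a")` on a `FactorizedModel(dim=8, rank=2)` adds 32 adapter parameters next to the 64 shared ones, and only those adapter parameters would be updated when fitting to unpaired videos of the new robot. The embodiment name `"robot_a"` is a placeholder.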

Code & Models

Repositories

showlab/OmniHumanoid (GitHub)
