X-Diffusion: Training Diffusion Policies on Cross-Embodiment Human Demonstrations

Maximus A. Pace; Prithwish Dan; Chuanruo Ning; Atiksh Bhardwaj; Audrey Du; Edward W. Duan; Wei-Chiu Ma; Kushal Kedia

arXiv:2511.04671·cs.RO·April 16, 2026

TL;DR

X-Diffusion uses Ambient Diffusion to learn robot policies from human demonstrations, treating human actions as noisy counterparts of robot actions so that task guidance transfers across embodiments without copying motions that are infeasible for the robot.

Contribution

The paper introduces X-Diffusion, a framework that treats human actions as noisy counterparts of robot actions within a diffusion model, so that policies can learn from human demonstrations despite the embodiment gap.

Findings

01  X-Diffusion improves success rates by 16% over naive methods.

02  Effective use of human videos without manual filtering.

03  Applicable across five real-world manipulation tasks.

Abstract

Human videos are a scalable source of training data for robot learning. However, humans and robots significantly differ in embodiment, making many human actions infeasible for direct execution on a robot. Still, these demonstrations convey rich object-interaction cues and task intent. Our goal is to learn from this coarse guidance without transferring embodiment-specific, infeasible execution strategies. Recent advances in generative modeling tackle a related problem of learning from low-quality data. In particular, Ambient Diffusion is a recent method for diffusion modeling that incorporates low-quality data only at high-noise timesteps of the forward diffusion process. Our key insight is to view human actions as noisy counterparts of robot actions. As noise increases along the forward diffusion process, embodiment-specific differences fade away while task-relevant guidance is…
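The timestep-gating idea in the abstract can be illustrated with a minimal sketch, assuming a DDPM-style forward process over action vectors. The names `make_alpha_bar`, `sample_timesteps`, `add_noise`, and `T_MIN_HUMAN`, as well as the linear beta schedule, are hypothetical and not taken from the paper. In particular, the fixed threshold below is only a stand-in for whatever rule the authors use to decide which noise levels admit human data.

```python
import numpy as np

NUM_STEPS = 100    # hypothetical number of diffusion timesteps
T_MIN_HUMAN = 60   # hypothetical cutoff: human actions are used only for t >= 60


def make_alpha_bar(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal-retention schedule for a linear beta schedule (assumed)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def sample_timesteps(is_human, t_min_human, num_steps, rng):
    """Draw one timestep per sample.

    Robot samples draw t uniformly from [0, num_steps).
    Human samples draw t uniformly from [t_min_human, num_steps),
    so they contribute only at high-noise timesteps.
    """
    low = np.where(is_human, t_min_human, 0)
    return rng.integers(low, num_steps)


def add_noise(actions, t, alpha_bar, rng):
    """Forward diffusion: x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps."""
    eps = rng.standard_normal(actions.shape)
    signal = np.sqrt(alpha_bar[t])[:, None]
    noise = np.sqrt(1.0 - alpha_bar[t])[:, None]
    return signal * actions + noise * eps, eps


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    alpha_bar = make_alpha_bar(NUM_STEPS)
    actions = rng.standard_normal((8, 7))          # toy batch of 7-D actions
    is_human = np.array([True, False] * 4)         # alternating data sources
    t = sample_timesteps(is_human, T_MIN_HUMAN, NUM_STEPS, rng)
    noisy_actions, eps = add_noise(actions, t, alpha_bar, rng)
    print(t, noisy_actions.shape)
```

In a training loop, `noisy_actions` and `t` would be fed to the denoising network and `eps` would serve as the regression target. The only change from standard diffusion training in this sketch is the restricted timestep range for human samples.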

Code & Models

Repositories

https://portal-cornell.github.io/X-Diffusion
github
