AnchorDream: Repurposing Video Diffusion for Embodiment-Aware Robot Data Synthesis

Junjie Ye; Rong Xue; Basile Van Hoorick; Pavel Tokmakov; Muhammad Zubair Irshad; Yue Wang; Vitor Guizilini

arXiv:2512.11797 · cs.RO · December 15, 2025

TL;DR

AnchorDream leverages pretrained video diffusion models conditioned on robot motion to synthesize diverse, embodiment-consistent robot data, significantly enhancing imitation learning datasets without explicit environment modeling.

Contribution

The paper introduces AnchorDream, an embodiment-aware world model built on pretrained video diffusion models. It scales a handful of demonstrations into large, diverse datasets while keeping the robot's motion plausible.

Findings

01 36.4% improvement in simulator benchmarks

02 Nearly double performance in real-world tasks

03 Effective scaling from few demonstrations

Abstract

The collection of large-scale and diverse robot demonstrations remains a major bottleneck for imitation learning, as real-world data acquisition is costly and simulators offer limited diversity and fidelity with pronounced sim-to-real gaps. While generative models present an attractive solution, existing methods often alter only visual appearances without creating new behaviors, or suffer from embodiment inconsistencies that yield implausible motions. To address these limitations, we introduce AnchorDream, an embodiment-aware world model that repurposes pretrained video diffusion models for robot data synthesis. AnchorDream conditions the diffusion process on robot motion renderings, anchoring the embodiment to prevent hallucination while synthesizing objects and environments consistent with the robot's kinematics. Starting from only a handful of human teleoperation demonstrations, our…

Taxonomy

Topics: Human Motion and Animation · Generative Adversarial Networks and Image Synthesis · Social Robot Interaction and HRI