InfiniteDance: Scalable 3D Dance Generation Towards in-the-wild Generalization

Ronghui Li; Zhongyuan Hu; Li Siyao; Youliang Zhang; Haozhe Xie; Mingyuan Zhang; Jie Guo; Xiu Li; Ziwei Liu

arXiv:2603.13375 · cs.CV · March 17, 2026

TL;DR

This paper introduces a scalable approach for 3D dance generation that generalizes well to in-the-wild scenarios by combining high-quality data reconstruction and advanced model design, including retrieval-augmented generation and rhythm adaptation.

Contribution

The paper presents a pipeline for reconstructing high-fidelity 3D dance data from monocular videos and a scalable LLaMA-based model with retrieval and mixture-of-experts (MoE) modules for improved generalization and rhythm adaptation.

Findings

1. Outperforms existing methods in qualitative evaluations.
2. Produces diverse, physically plausible 3D dances from unseen music.
3. Demonstrates robustness across various dance genres.

Abstract

Although existing 3D dance generation methods perform well in controlled scenarios, they often struggle to generalize in the wild. When conditioned on unseen music, existing methods often produce unstructured or physically implausible dance, largely due to limited music-to-dance data and restricted model capacity. This work aims to push the frontier of generalizable 3D dance generation by scaling up both data and model design. (1) On the data side, we develop a fully automated pipeline that reconstructs high-fidelity 3D dance motions from monocular videos. To eliminate the physical artifacts prevalent in existing reconstruction methods, we introduce a Foot Restoration Diffusion Model (FRDM) guided by foot-contact and geometric constraints that enforce physical plausibility while preserving kinematic smoothness and expressiveness, resulting in a diverse, high-quality multimodal 3D dance…
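
To make the idea of foot-contact and geometric constraints concrete, the following is a minimal sketch of two penalties that such constraints commonly target: foot sliding during ground contact and ground penetration. It is not the paper's FRDM. The function name `foot_penalties`, the y-up convention with the ground at y = 0, and the 0.05 m contact threshold are all assumptions made for illustration.

```python
import math
from typing import Sequence, Tuple

# Hypothetical convention: (x, y, z) in metres, y is up, ground plane at y = 0.
Vec3 = Tuple[float, float, float]


def foot_penalties(foot: Sequence[Vec3], contact_height: float = 0.05) -> Tuple[float, float]:
    """Return (mean sliding distance per contact step, mean penetration depth).

    `foot` is the per-frame position of one foot joint. The height-based
    contact rule and its threshold are illustrative assumptions.
    """
    slide_total, contact_steps = 0.0, 0
    for (x0, y0, z0), (x1, y1, z1) in zip(foot, foot[1:]):
        # Treat the foot as planted when it stays below the height threshold
        # on two consecutive frames; any horizontal motion then counts as sliding.
        if y0 < contact_height and y1 < contact_height:
            slide_total += math.hypot(x1 - x0, z1 - z0)
            contact_steps += 1
    slide = slide_total / contact_steps if contact_steps else 0.0

    # Geometric term: how far, on average, the joint dips below the ground plane.
    penetration = sum(max(0.0, -y) for _, y, _ in foot) / len(foot) if foot else 0.0
    return slide, penetration


if __name__ == "__main__":
    sliding_foot = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.2, 0.0, 0.0)]
    print(foot_penalties(sliding_foot))
```

A reconstruction with a planted, non-penetrating foot scores zero on both terms, so either value can serve as a rough plausibility check on reconstructed motion under the stated assumptions.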

Taxonomy

Topics: Human Motion and Animation · 3D Shape Modeling and Analysis · Generative Adversarial Networks and Image Synthesis