OpenDance: Multimodal Controllable 3D Dance Generation with Large-scale Internet Data

Jinlu Zhang; Zixi Kang; Libin Liu; Jianlong Chang; Qi Tian; Feng Gao; Yizhou Wang

arXiv:2506.07565 · cs.CV · December 1, 2025

TL;DR

OpenDance introduces a large-scale, richly annotated dance dataset and a novel multimodal generative model that enables controllable, diverse, and realistic 3D dance synthesis conditioned on music, text, keypoints, or trajectories.

Contribution

The paper presents OpenDanceSet, a comprehensive dance dataset, and OpenDanceNet, a unified framework for multimodal, controllable 3D dance generation with high fidelity and diversity.

Findings

01. High-fidelity dance synthesis with diverse styles.

02. Effective control over spatial and stylistic conditions.

03. Robust cross-modal learning enabled by rich annotations.

Abstract

Music-driven 3D dance generation offers significant creative potential, yet practical applications demand versatile, multimodal control. Because dance is a highly dynamic and complex form of human motion spanning many styles and genres, dance generation must satisfy diverse conditions beyond music alone (e.g., spatial trajectories, keyframe gestures, or style descriptions). However, the absence of a large-scale, richly annotated dataset severely hinders progress. In this paper, we build OpenDanceSet, an extensive human dance dataset comprising over 100 hours across 14 genres and 147 subjects. Each sample has rich annotations to facilitate robust cross-modal learning: 3D motion, paired music, 2D keypoints, trajectories, and expert-annotated text descriptions. Furthermore, we propose OpenDanceNet, a unified masked modeling framework for controllable dance generation, including a disentangled…
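
The per-sample annotations listed in the abstract (3D motion, paired music, 2D keypoints, trajectories, text descriptions) can be pictured as one record per dance clip. The following is a minimal Python sketch of such a record and of picking a subset of conditions for generation. It is an illustration only: the class name `DanceSample`, the field names, the array layouts, and the helper `select_conditions` are all assumptions made here, not the released OpenDanceSet format or the OpenDanceNet interface.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence

# frames x joints x coordinates, stored as nested lists to stay dependency-free
Frames = List[List[List[float]]]


@dataclass
class DanceSample:
    """Hypothetical record for one clip; every field name is an assumption."""
    motion_3d: Frames              # target 3D motion (frames x joints x 3)
    keypoints_2d: Frames           # 2D keypoints (frames x joints x 2)
    trajectory: List[List[float]]  # per-frame root position (frames x 3)
    music_path: str                # path to the paired music clip
    text: str                      # text description of the dance

    def num_frames(self) -> int:
        return len(self.motion_3d)

    def is_aligned(self) -> bool:
        # Frame-level annotations should cover the same number of frames.
        n = self.num_frames()
        return len(self.keypoints_2d) == n and len(self.trajectory) == n


# Fields that may serve as conditioning signals; motion_3d is the target.
CONDITION_FIELDS = ("music_path", "text", "keypoints_2d", "trajectory")


def select_conditions(sample: DanceSample, names: Sequence[str]) -> Dict[str, object]:
    """Return only the requested conditioning signals from a sample."""
    unknown = [name for name in names if name not in CONDITION_FIELDS]
    if unknown:
        raise ValueError(f"not a conditioning field: {unknown}")
    return {name: getattr(sample, name) for name in names}
```

Keeping the conditions as named, optional inputs reflects the abstract's point that generation may be driven by music, text, keypoints, or trajectories in different combinations. For example, `select_conditions(sample, ["text", "trajectory"])` would return just those two signals.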

Taxonomy

Topics: Human Motion and Animation · Generative Adversarial Networks and Image Synthesis · 3D Shape Modeling and Analysis