TokenDance: Token-to-Token Music-to-Dance Generation with Bidirectional Mamba

Ziyue Yang; Kaixing Yang; Xulong Tang

arXiv:2603.27314·cs.AI·March 31, 2026

TL;DR

TokenDance is a novel two-stage framework for music-to-dance generation that uses dual-modality tokenization and a bidirectional generator to improve realism, diversity, and efficiency in dance synthesis.

Contribution

Introduces dual-modality tokenization of music and dance and a Bidirectional Mamba-based generator for coherent, high-quality, and fast music-to-dance synthesis, targeting the poor generalization caused by the limited coverage of existing 3D dance datasets.

Findings

01. Achieves state-of-the-art performance in dance quality and speed.

02. Effectively captures choreography-specific structures in music and dance.

03. Demonstrates strong generalization to diverse music styles.

Abstract

Music-to-dance generation has broad applications in virtual reality, dance education, and digital character animation. However, the limited coverage of existing 3D dance datasets confines current models to a narrow subset of music styles and choreographic patterns, resulting in poor generalization to real-world music. Consequently, generated dances often become overly simplistic and repetitive, substantially degrading expressiveness and realism. To tackle this problem, we present TokenDance, a two-stage music-to-dance generation framework that explicitly addresses this limitation through dual-modality tokenization and efficient token-level generation. In the first stage, we discretize both dance and music using Finite Scalar Quantization, where dance motions are factorized into upper and lower-body components with kinematic-dynamic constraints, and music is decomposed into semantic…
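The abstract names Finite Scalar Quantization (FSQ) as the discretization step for both modalities. As a rough illustration of the general technique only, and not the authors' implementation, the sketch below bounds each latent dimension with tanh, rounds it to a small set of integer levels, and packs the per-dimension codes into a single token index. The function names, the level counts, and the restriction to odd level counts are assumptions made for this example; the straight-through gradient used in training and the paper's upper/lower-body and music-specific factorization are omitted.

```python
import math


def fsq_quantize(z, levels):
    """Quantize one latent vector with FSQ-style bounded rounding.

    Hypothetical helper: `levels` holds the number of quantization levels
    per dimension. Only odd counts are handled, so codes are symmetric
    integers in [-(L - 1) / 2, (L - 1) / 2].
    """
    if len(z) != len(levels):
        raise ValueError("z and levels must have the same length")
    codes = []
    for value, n_levels in zip(z, levels):
        if n_levels % 2 == 0:
            raise ValueError("this sketch only handles odd level counts")
        half = (n_levels - 1) // 2
        bounded = half * math.tanh(value)  # squash into (-half, half)
        codes.append(round(bounded))       # snap to the nearest integer level
    return codes


def codes_to_index(codes, levels):
    """Pack per-dimension codes into one token id (mixed-radix encoding)."""
    index = 0
    for code, n_levels in zip(codes, levels):
        half = (n_levels - 1) // 2
        index = index * n_levels + (code + half)  # shift code to 0..L-1
    return index


def index_to_codes(index, levels):
    """Invert codes_to_index, recovering the per-dimension codes."""
    codes = []
    for n_levels in reversed(levels):
        half = (n_levels - 1) // 2
        index, digit = divmod(index, n_levels)
        codes.append(digit - half)
    return codes[::-1]


# Assumed toy configuration: 3 latent dimensions, 5 * 5 * 3 = 75 tokens.
levels = [5, 5, 3]
codes = fsq_quantize([0.0, 10.0, -10.0], levels)
token = codes_to_index(codes, levels)
```

In this scheme the codebook is implicit: its size is the product of the level counts, so no embedding table has to be learned for the quantizer itself, and the resulting integer ids are what a token-level generator would consume.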
