DanceChat: Large Language Model-Guided Music-to-Dance Generation
Qing Wang, Xiaohang Yang, Yilan Dong, Naveen Raj Govindaraj, Gregory Slabaugh, Shanxin Yuan

TL;DR
DanceChat introduces a novel LLM-guided approach for music-to-dance generation, using textual instructions to improve diversity and alignment with musical styles, addressing the semantic gap and data scarcity issues.
Contribution
The paper proposes a new LLM-guided framework that generates explicit textual dance instructions to enhance diversity and style alignment in music-to-dance synthesis.
Findings
Outperforms state-of-the-art methods in qualitative and quantitative evaluations
Generates more diverse dance movements aligned with musical styles
Effectively incorporates textual guidance to improve dance synthesis
Abstract
Music-to-dance generation aims to synthesize human dance motion conditioned on musical input. Despite recent progress, significant challenges remain due to the semantic gap between music and dance motion, as music offers only abstract cues, such as melody, groove, and emotion, without explicitly specifying the physical movements. Moreover, a single piece of music can produce multiple plausible dance interpretations. This one-to-many mapping demands additional guidance, as music alone provides limited information for generating diverse dance movements. The challenge is further amplified by the scarcity of paired music and dance data, which restricts the model's ability to learn diverse dance patterns. In this paper, we introduce DanceChat, a Large Language Model (LLM)-guided music-to-dance generation approach. We use an LLM as a choreographer that provides textual motion…
Taxonomy
Topics: Human Motion and Animation · Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications
