DGFM: Full Body Dance Generation Driven by Music Foundation Models

Xinran Liu; Zhenhua Feng; Diptesh Kanojia; Wenwu Wang

arXiv:2502.20176 · cs.SD · February 28, 2025

Open Access

TL;DR

This paper introduces a diffusion-based approach for full-body dance generation driven by music, combining music foundation model features with hand-crafted features to produce realistic dance sequences aligned with music.

Contribution

The paper presents a novel diffusion-based method that integrates high-level music features from foundation models with hand-crafted features for improved dance generation.

Findings

01 Produces the most realistic dance sequences among the compared methods

02 Achieves the best match with the input music among the tested methods

03 Compared against four music foundation models and two sets of hand-crafted music features

Abstract

In music-driven dance motion generation, most existing methods use hand-crafted features and neglect that music foundation models have profoundly impacted cross-modal content generation. To bridge this gap, we propose a diffusion-based method that generates dance movements conditioned on text and music. Our approach extracts music features by combining high-level features obtained by a music foundation model with hand-crafted features, thereby enhancing the quality of generated dance sequences. This method effectively leverages the advantages of high-level semantic information and low-level temporal details to improve the model's capability in music feature understanding. To show the merits of the proposed method, we compare it with four music foundation models and two sets of hand-crafted music features. The results demonstrate that our method obtains the most realistic dance sequences…
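
The feature combination described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the two feature streams are fused, so frame-wise concatenation after linear resampling is an assumption here, and the function names (`resample_to_frames`, `fuse_music_features`) and all array shapes are hypothetical.

```python
import numpy as np


def resample_to_frames(features: np.ndarray, num_frames: int) -> np.ndarray:
    """Linearly interpolate a (T_src, D) feature sequence to (num_frames, D).

    Assumption: both feature streams cover the same audio clip, so their
    time axes can be aligned on a normalized [0, 1] scale.
    """
    src_len, dim = features.shape
    src_pos = np.linspace(0.0, 1.0, src_len)
    dst_pos = np.linspace(0.0, 1.0, num_frames)
    columns = [np.interp(dst_pos, src_pos, features[:, d]) for d in range(dim)]
    return np.stack(columns, axis=1)


def fuse_music_features(foundation_feats: np.ndarray,
                        handcrafted_feats: np.ndarray) -> np.ndarray:
    """Concatenate high-level and hand-crafted features frame by frame.

    foundation_feats:  (T_model, D_model) embeddings from a music
                       foundation model (hypothetical shape).
    handcrafted_feats: (T_motion, D_hand) hand-crafted features already
                       sampled at the motion frame rate (hypothetical shape).
    Returns an array of shape (T_motion, D_model + D_hand).
    """
    num_frames = handcrafted_feats.shape[0]
    aligned = resample_to_frames(foundation_feats, num_frames)
    return np.concatenate([aligned, handcrafted_feats], axis=1)
```

In this sketch, the fused array would serve as the music conditioning signal for the diffusion model, with one row per motion frame. The resampling step is needed only when the foundation model emits embeddings at a different rate from the motion frames.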

Taxonomy

Topics: Human Motion and Animation · Generative Adversarial Networks and Image Synthesis · 3D Shape Modeling and Analysis