MEGADance: Mixture-of-Experts Architecture for Genre-Aware 3D Dance Generation
Kaixing Yang, Xulong Tang, Ziqiao Peng, Yuxuan Hu, Jun He, Hongyan Liu

TL;DR
MEGADance introduces a novel mixture-of-experts architecture that enhances genre-aware 3D dance generation from music, improving synchronization, genre control, and dance quality through a two-stage process involving dance encoding and music-to-dance mapping.
Contribution
The paper presents a new architecture that decouples dance generality and genre specificity, utilizing a mixture-of-experts model with a hybrid transformer backbone for improved genre-aware dance synthesis.
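The decoupling described above can be pictured as one shared expert that captures dance generality plus one expert per genre that captures genre specificity. The following is a minimal sketch under stated assumptions, not the paper's implementation: the class name `GenreMoELayer`, the hard gating by genre label, the additive combination of the two branches, the plain linear experts, and the genre labels are all hypothetical, and the hybrid transformer backbone is omitted.

```python
import random


def make_linear(in_dim, out_dim, rng):
    """Return a random (out_dim x in_dim) weight matrix as nested lists."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]


def apply_linear(weights, x):
    """Matrix-vector product: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


class GenreMoELayer:
    """Hypothetical layer: one shared expert plus one expert per genre."""

    def __init__(self, dim, genres, seed=0):
        rng = random.Random(seed)
        # Shared expert: used for every input (dance generality).
        self.universal = make_linear(dim, dim, rng)
        # Genre experts: exactly one is selected per input (genre specificity).
        self.specialized = {g: make_linear(dim, dim, rng) for g in genres}

    def forward(self, x, genre):
        if genre not in self.specialized:
            raise KeyError(f"unknown genre: {genre}")
        shared = apply_linear(self.universal, x)
        specific = apply_linear(self.specialized[genre], x)
        # Assumption: the two branches are combined by simple addition.
        return [s + p for s, p in zip(shared, specific)]


# Illustrative genre labels and feature vector, not taken from the paper.
layer = GenreMoELayer(dim=4, genres=["jazz", "hiphop"])
feature = [0.5, -0.2, 0.1, 0.9]
out_jazz = layer.forward(feature, "jazz")
out_hiphop = layer.forward(feature, "hiphop")
```

With the same input feature, the two outputs share the contribution of the shared expert and differ only through the selected genre expert, which is the sense in which the genre label acts as a core driver of the output rather than an optional modifier.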
Findings
Achieves state-of-the-art results on the FineDance and AIST++ datasets.
Demonstrates strong genre controllability and dance quality.
Outperforms previous methods both qualitatively and quantitatively.
Abstract
Music-driven 3D dance generation has attracted increasing attention in recent years, with promising applications in choreography, virtual reality, and creative content creation. Previous research has generated promising, realistic dance movements from audio signals. However, traditional methods underutilize genre conditioning, often treating genre as an auxiliary modifier rather than a core semantic driver. This oversight compromises music-motion synchronization and disrupts dance genre continuity, particularly during complex rhythmic transitions, leading to visually unsatisfactory results. To address this challenge, we propose MEGADance, a novel architecture for music-driven 3D dance generation. By decoupling choreographic consistency into dance generality and genre specificity, MEGADance demonstrates high dance quality and strong genre controllability. It consists of two stages:…
Taxonomy
Topics: Human Motion and Animation · Artificial Intelligence in Games · Human Pose and Action Recognition
