DuetGen: Music Driven Two-Person Dance Generation via Hierarchical Masked Modeling

Anindita Ghosh; Bing Zhou; Rishabh Dabral; Jian Wang; Vladislav Golyanik; Christian Theobalt; Philipp Slusallek; Chuan Guo

arXiv:2506.18680 · cs.GR · June 24, 2025

TL;DR

DuetGen is a hierarchical masked modeling framework that generates synchronized two-person dances from music by encoding motions into discrete tokens and using transformers to produce realistic, interactive dance sequences.

Contribution

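DuetGen introduces a hierarchical token-based approach to music-driven two-person dance generation: both dancers' motions are encoded jointly into coarse and fine discrete tokens, and masked transformers generate those tokens from music, so that partner interactions are modeled as a unified whole rather than per dancer.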
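To make the idea of masked token generation concrete, here is a minimal sketch of iterative unmasking over a token sequence. It is an illustration under stated assumptions, not the paper's decoding procedure: the cosine schedule, the confidence-based selection, and the names `iterative_unmask`, `predict`, and `MASK` are all hypothetical. In a real system, `predict` would be a music-conditioned transformer; here it is any callable that proposes a token and a confidence for every position.

```python
import math
from typing import Callable, List, Tuple

MASK = -1  # hypothetical sentinel for a position that has not been generated yet

# A predictor maps the current (partially masked) sequence to one
# (token, confidence) proposal per position. Assumed interface.
Predictor = Callable[[List[int]], List[Tuple[int, float]]]


def iterative_unmask(length: int, predict: Predictor, steps: int = 4) -> List[int]:
    """Fill a fully masked sequence over several steps, most confident positions first."""
    tokens = [MASK] * length
    for step in range(1, steps + 1):
        masked = [i for i, tok in enumerate(tokens) if tok == MASK]
        if not masked:
            break
        # Cosine schedule (an assumption): how many positions stay masked after this step.
        keep_masked = int(length * math.cos(math.pi / 2 * step / steps))
        num_to_fill = max(1, len(masked) - keep_masked)
        proposals = predict(tokens)
        # Commit the proposals the predictor is most confident about.
        ranked = sorted(masked, key=lambda i: proposals[i][1], reverse=True)
        for i in ranked[:num_to_fill]:
            tokens[i] = proposals[i][0]
    return tokens
```

On the final step the schedule keeps zero positions masked, so every remaining position is filled and the returned sequence contains no `MASK` entries as long as the predictor never proposes `MASK` itself.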

Findings

01. Achieves state-of-the-art motion realism and synchronization.

02. Effectively models intricate partner interactions.

03. Demonstrates versatility across various dance genres.

Abstract

We present DuetGen, a novel framework for generating interactive two-person dances from music. The key challenge of this task lies in the inherent complexities of two-person dance interactions, where the partners need to synchronize both with each other and with the music. Inspired by recent advances in motion synthesis, we propose a two-stage solution: encoding two-person motions into discrete tokens and then generating these tokens from music. To effectively capture intricate interactions, we represent both dancers' motions as a unified whole to learn the necessary motion tokens, and adopt a coarse-to-fine learning strategy in both stages. Our first stage utilizes a VQ-VAE that hierarchically separates high-level semantic features at a coarse temporal resolution from low-level details at a finer resolution, producing two discrete token sequences at different abstraction…
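The two-level tokenization described in the abstract can be illustrated with a small sketch. This is not the paper's architecture: it skips the learned encoder and decoder entirely and assumes, for illustration only, that the coarse level quantizes time-averaged features and the fine level quantizes the per-frame residual left over. The function names, the pooling factor, and the residual design are all hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def nearest_code(vec: Vector, codebook: Sequence[Vector]) -> int:
    """Return the index of the codebook entry closest to vec (squared L2 distance)."""
    best_index, best_dist = 0, float("inf")
    for index, code in enumerate(codebook):
        dist = sum((a - b) ** 2 for a, b in zip(vec, code))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def average_pool(frames: Sequence[Vector], factor: int) -> List[List[float]]:
    """Downsample in time by averaging non-overlapping windows of `factor` frames."""
    pooled = []
    for start in range(0, len(frames), factor):
        window = frames[start:start + factor]
        dim = len(window[0])
        pooled.append([sum(f[d] for f in window) / len(window) for d in range(dim)])
    return pooled


def hierarchical_quantize(
    frames: Sequence[Vector],
    coarse_codebook: Sequence[Vector],
    fine_codebook: Sequence[Vector],
    factor: int = 2,
) -> Tuple[List[int], List[int]]:
    """Produce one coarse token per `factor` frames and one fine token per frame."""
    # Coarse level: quantize features at the lower temporal resolution.
    coarse_tokens = [nearest_code(v, coarse_codebook) for v in average_pool(frames, factor)]
    # Fine level (assumed residual design): quantize what the coarse code misses.
    fine_tokens = []
    for t, frame in enumerate(frames):
        coarse_vec = coarse_codebook[coarse_tokens[t // factor]]
        residual = [a - b for a, b in zip(frame, coarse_vec)]
        fine_tokens.append(nearest_code(residual, fine_codebook))
    return coarse_tokens, fine_tokens
```

In this toy setup each frame would be a feature vector covering both dancers together, which mirrors the abstract's choice to represent the pair as a unified whole. The coarse sequence is shorter than the fine one by the pooling factor, giving two token streams at different temporal resolutions.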

Code & Models

Repositories

anindita127/duetgen (official)

Taxonomy

Methods: VQ-VAE · ADaptive gradient method with the OPTimal convergence rate