Listen to Rhythm, Choose Movements: Autoregressive Multimodal Dance Generation via Diffusion and Mamba with Decoupled Dance Dataset

Oran Duan; Yinghua Shen; Yingzhu Lv; Luyang Jie; Yaxin Liu; Qiong Wu

arXiv:2601.03323·cs.GR·April 8, 2026

TL;DR

This paper introduces LRCM, a multimodal-guided diffusion framework for dance motion generation. It conditions on audio rhythm and text descriptions, and uses a Motion Temporal Mamba Module to generate coherent, long-duration sequences autoregressively.

Contribution

The work presents a new decoupling paradigm for dance datasets and a diffusion architecture with a Motion Temporal Mamba Module for improved long-sequence dance synthesis.
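To make the decoupling idea concrete, here is a minimal sketch of how one clip might be stored with its modalities kept separate. The class names, field names, per-frame layout, and the toy values are all assumptions made for illustration; the paper's actual data format is not described on this page.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LocalCaption:
    """A text description attached to a frame range (hypothetical layout)."""
    start_frame: int
    end_frame: int  # exclusive
    text: str


@dataclass
class DecoupledDanceSample:
    """One clip with its modalities stored separately (hypothetical schema)."""
    motion: List[List[float]]   # frames x pose features from motion capture
    rhythm: List[float]         # one audio rhythm value per frame
    global_text: str            # description of the whole clip
    local_texts: List[LocalCaption] = field(default_factory=list)

    def captions_at(self, frame: int) -> List[str]:
        """Return the local descriptions that cover the given frame."""
        return [c.text for c in self.local_texts
                if c.start_frame <= frame < c.end_frame]


# Toy values, invented purely for illustration.
sample = DecoupledDanceSample(
    motion=[[0.0, 0.1], [0.2, 0.3], [0.4, 0.5], [0.6, 0.7]],
    rhythm=[1.0, 0.0, 0.0, 1.0],
    global_text="an energetic jazz routine",
    local_texts=[LocalCaption(0, 2, "step to the left"),
                 LocalCaption(2, 4, "spin in place")],
)
print(sample.captions_at(1))
```

Keeping each modality in its own field means a model can be conditioned on any subset (rhythm only, text only, or both) without re-parsing a combined annotation.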

Findings

01. LRCM outperforms existing methods in quantitative metrics.

02. The framework supports diverse multimodal inputs.

03. LRCM generates smooth, long-duration dance sequences.

Abstract

Advances in generative models and sequence learning have greatly promoted research in dance motion generation, yet current methods still suffer from coarse semantic control and poor coherence in long sequences. In this work, we present Listen to Rhythm, Choose Movements (LRCM), a multimodal-guided diffusion framework supporting both diverse input modalities and autoregressive dance motion generation. We explore a feature decoupling paradigm for dance datasets and generalize it to the Motorica Dance dataset, separating motion capture data, audio rhythm, and professionally annotated global and local text descriptions. Our diffusion architecture integrates an audio-latent Conformer and a text-latent Cross-Conformer, and incorporates a Motion Temporal Mamba Module (MTMM) to enable smooth, long-duration autoregressive synthesis. Experimental results indicate that LRCM delivers strong…
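The autoregressive part of the abstract can be illustrated with a generic windowed rollout, in which each new window of frames is generated conditioned on the tail of what has been produced so far. This is only a sketch under stated assumptions: `toy_generate_window` is a random-walk placeholder, not a diffusion model and not the MTMM, and the window and context sizes are arbitrary values chosen for the example.

```python
import random
from typing import Callable, List

Frame = List[float]


def toy_generate_window(context: List[Frame], length: int,
                        rng: random.Random) -> List[Frame]:
    """Placeholder for a conditional generator (NOT a diffusion model).

    Continues from the last context frame with a small random walk.
    """
    last = context[-1] if context else [0.0, 0.0]
    out = []
    for _ in range(length):
        last = [v + rng.uniform(-0.05, 0.05) for v in last]
        out.append(last)
    return out


def autoregressive_rollout(
    generate: Callable[[List[Frame], int, random.Random], List[Frame]],
    total_frames: int,
    window: int,
    context_len: int,
    seed: int = 0,
) -> List[Frame]:
    """Build a long sequence window by window, feeding back recent frames."""
    rng = random.Random(seed)
    frames: List[Frame] = []
    while len(frames) < total_frames:
        # Condition the next window on the most recent generated frames.
        context = frames[-context_len:] if context_len > 0 else []
        n = min(window, total_frames - len(frames))
        frames.extend(generate(context, n, rng))
    return frames


motion = autoregressive_rollout(toy_generate_window, total_frames=25,
                                window=10, context_len=4)
print(len(motion))
```

Because each window starts from the previous window's final frames, the sequence has no jump at window boundaries. In a real system the placeholder would be replaced by the learned conditional generator, with audio and text features passed alongside the motion context.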

Code & Models

Repositories

https://oranduanstudy.github.io/LRCM
