ST-GDance++: A Scalable Spatial-Temporal Diffusion for Long-Duration Group Choreography
Jing Xu, Weiqiang Wang, Cunjian Chen, Jun Liu, Qiuhong Ke

TL;DR
ST-GDance++ is a scalable framework for long-duration group dance generation that efficiently models spatial-temporal interactions, reducing computational complexity and improving stability in multi-dancer choreography.
Contribution
It introduces a decoupled spatial-temporal diffusion approach with lightweight graph convolutions and an efficient attention mask for scalable, collision-aware group dance synthesis.
Findings
Achieves competitive quality with lower latency
Effectively models dense spatial-temporal interactions
Enables long-duration, stable choreography generation
Abstract
Group dance generation from music requires synchronizing multiple dancers while maintaining spatial coordination, making it highly relevant to applications such as film production, gaming, and animation. Recent group dance generation models have achieved promising generation quality, but they remain difficult to deploy in interactive scenarios due to bidirectional attention dependencies. As the number of dancers and the sequence length increase, the attention computation required for aligning music conditions with motion sequences grows quadratically, leading to reduced efficiency and increased risk of motion collisions. Effectively modeling dense spatial-temporal interactions is therefore essential, yet existing methods often struggle to capture such complexity, resulting in limited scalability and unstable multi-dancer coordination. To address these challenges, we propose ST-GDance++,…
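The quadratic growth described in the abstract can be illustrated with a back-of-the-envelope count of attention pairs. The sketch below is not the paper's cost model. It assumes one token per dancer per frame and compares full joint attention over all tokens with a generic factorization into per-frame spatial attention and per-dancer temporal attention. The function names and the example sizes are hypothetical.

```python
def joint_attention_pairs(num_dancers: int, num_frames: int) -> int:
    """Query-key pairs when every (dancer, frame) token attends to every other token."""
    tokens = num_dancers * num_frames
    return tokens * tokens


def decoupled_attention_pairs(num_dancers: int, num_frames: int) -> int:
    """Query-key pairs under an assumed spatial/temporal factorization.

    Spatial: within each frame, dancers attend to each other.
    Temporal: within each dancer, frames attend to each other.
    This is an illustrative assumption, not the method from the paper.
    """
    spatial = num_frames * num_dancers ** 2
    temporal = num_dancers * num_frames ** 2
    return spatial + temporal


if __name__ == "__main__":
    # Hypothetical group sizes and sequence lengths, chosen only for illustration.
    for dancers, frames in [(2, 120), (4, 240), (8, 480)]:
        joint = joint_attention_pairs(dancers, frames)
        decoupled = decoupled_attention_pairs(dancers, frames)
        print(f"dancers={dancers} frames={frames} "
              f"joint={joint} decoupled={decoupled} "
              f"ratio={joint / decoupled:.1f}")
```

Under these assumptions, the joint count grows with the square of dancers times frames, while the factorized count grows with the square of only one dimension at a time, which is why the gap widens as groups and sequences get larger.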
Taxonomy
Topics: Human Motion and Animation · Artificial Intelligence in Games · Generative Adversarial Networks and Image Synthesis
