Learning to Generate Diverse Dance Motions with Transformer
Jiaman Li, Yihang Yin, Hang Chu, Yi Zhou, Tingwu Wang, Sanja Fidler, Hao Li

TL;DR
This paper presents a novel transformer-based system for generating diverse, high-quality dance motions from music, leveraging a large YouTube-based dataset and new evaluation metrics, suitable for virtual concerts and professional animation.
Contribution
Introduces a two-stream motion transformer model trained on a large YouTube dance dataset, enabling flexible and diverse dance motion synthesis from music.
Findings
Outperforms state-of-the-art dance motion generation methods
Demonstrates effectiveness of online videos for training dance models
Provides high-quality dance animations for virtual events
Abstract
With the ongoing pandemic, virtual concerts and live events using digitized performances of musicians are getting traction on massive multiplayer online worlds. However, well choreographed dance movements are extremely complex to animate and would involve an expensive and tedious production process. In addition to the use of complex motion capture systems, it typically requires a collaborative effort between animators, dancers, and choreographers. We introduce a complete system for dance motion synthesis, which can generate complex and highly diverse dance sequences given an input music sequence. As motion capture data is limited for the range of dance motions and styles, we introduce a massive dance motion data set that is created from YouTube videos. We also present a novel two-stream motion transformer generative model, which can generate motion sequences with high flexibility. We…
Taxonomy
Topics: Human Motion and Animation · Human Pose and Action Recognition · Music and Audio Processing
