Listen to Dance: Music-driven choreography generation using Autoregressive Encoder-Decoder Network
Juheon Lee, Seohyun Kim, Kyogu Lee

TL;DR
This paper introduces a system that generates dance choreography from music using an autoregressive encoder-decoder network, effectively translating audio into natural dance movements.
Contribution
It presents a music-driven choreography generation method that trains an autoregressive encoder-decoder network on paired music and dance-motion data extracted from multimedia clips.
Findings
Generated dance motions are musically meaningful.
The system produces natural dance movements.
A qualitative user study supports the quality of the generated choreography.
Abstract
Automatic choreography generation is a challenging task because it often requires an understanding of two abstract concepts - music and dance - which are realized in the two different modalities, namely audio and video, respectively. In this paper, we propose a music-driven choreography generation system using an auto-regressive encoder-decoder network. To this end, we first collect a set of multimedia clips that include both music and corresponding dance motion. We then extract the joint coordinates of the dancer from video and the mel-spectrogram of music from audio, and train our network using music-choreography pairs as input. Finally, a novel dance motion is generated at the inference time when only music is given as an input. We performed a user study for a qualitative evaluation of the proposed method, and the results show that the proposed model is able to generate musically…
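The inference step described in the abstract (music in, dance motion out, one frame at a time) can be illustrated with a minimal sketch. This is not the authors' network: `make_toy_step` is a hypothetical stand-in that uses random, untrained weights, and the sizes used here (80 mel bins, 28 pose values, 50 frames) are assumptions chosen only for illustration. The sketch shows only the autoregressive loop, in which each generated pose is fed back as input for the next frame.

```python
import numpy as np

def generate_dance(mel, step_fn, pose_dim, init_pose=None):
    """Autoregressively generate one pose vector per audio frame.

    mel:     array of shape (T, n_mels), e.g. mel-spectrogram frames.
    step_fn: hypothetical stand-in for a trained network; maps
             (audio_frame, previous_pose) -> next_pose.
    """
    num_frames = mel.shape[0]
    if init_pose is None:
        prev = np.zeros(pose_dim)
    else:
        prev = np.asarray(init_pose, dtype=float)
    poses = np.empty((num_frames, pose_dim))
    for t in range(num_frames):
        # Feed the previous output back in alongside the current audio frame.
        prev = step_fn(mel[t], prev)
        poses[t] = prev
    return poses

def make_toy_step(n_mels, pose_dim, seed=0):
    """Build a toy step function with random (untrained) weights. Assumption only."""
    rng = np.random.default_rng(seed)
    w_audio = rng.normal(scale=0.1, size=(pose_dim, n_mels))
    w_pose = rng.normal(scale=0.1, size=(pose_dim, pose_dim))

    def step(audio_frame, prev_pose):
        return np.tanh(w_audio @ audio_frame + w_pose @ prev_pose)

    return step

# Random data standing in for a real mel-spectrogram (assumed sizes).
mel = np.random.default_rng(1).normal(size=(50, 80))
poses = generate_dance(mel, make_toy_step(80, 28), pose_dim=28)
print(poses.shape)
```

In a real system the step function would be the trained decoder, and the pose vector would hold the dancer's joint coordinates extracted from video.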
Taxonomy
Topics: Music and Audio Processing · Human Motion and Animation · Human Pose and Action Recognition
