Dance Revolution: Long-Term Dance Generation with Music via Curriculum Learning
Ruozi Huang, Huang Hu, Wei Wu, Kei Sawada, Mi Zhang, Daxin Jiang

TL;DR
This paper introduces a novel sequence-to-sequence model with curriculum learning for long-term dance generation from music, effectively capturing music-dance correspondence and reducing error accumulation in long sequences.
Contribution
It proposes a new seq2seq architecture combined with curriculum learning to improve long-term dance synthesis from music, addressing error propagation and style consistency.
Findings
Outperforms existing methods on automatic metrics
Achieves higher human evaluation scores
Effectively models long-term music-dance alignment
Abstract
Dancing to music has been one of humans' innate abilities since ancient times. In machine learning research, however, synthesizing dance movements from music is a challenging problem. Recently, researchers have synthesized human motion sequences with autoregressive models such as recurrent neural networks (RNNs). Such an approach often generates only short sequences because prediction errors that are fed back into the network accumulate. This problem becomes even more severe in long motion sequence generation. Moreover, the consistency between dance and music in terms of style, rhythm, and beat has yet to be taken into account during modeling. In this paper, we formalize music-conditioned dance generation as a sequence-to-sequence learning problem and devise a novel seq2seq architecture to efficiently process long sequences of music features and capture the fine-grained correspondence…
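The error accumulation described in the abstract, and the general idea of a curriculum that moves from teacher forcing toward feeding predictions back, can be illustrated with a toy sketch. This is not the paper's model or its training schedule: the linear schedule, the function names `autoregressive_steps`, `rollout`, and `max_error`, and the one-dimensional "motion" with a biased step function are all assumptions made purely for illustration.

```python
def autoregressive_steps(epoch, total_epochs, seq_len):
    """Hypothetical linear curriculum (an assumption, not the paper's schedule):
    how many positions of a sequence feed the model's own output back in."""
    frac = min(1.0, epoch / max(1, total_epochs - 1))
    return int(round(frac * seq_len))


def rollout(step_fn, ground_truth, n_autoregressive):
    """Predict frames 1..T-1 of one sequence.

    Positions before the switch point use teacher forcing (the next input is
    the ground-truth frame); positions from the switch point on feed the
    model's own prediction back as the next input.
    """
    seq_len = len(ground_truth)
    switch = seq_len - n_autoregressive
    prev = ground_truth[0]
    outputs = []
    for t in range(1, seq_len):
        pred = step_fn(prev)
        outputs.append(pred)
        prev = ground_truth[t] if t < switch else pred
    return outputs


def max_error(outputs, ground_truth):
    """Largest absolute deviation between predictions and frames 1..T-1."""
    return max(abs(o - g) for o, g in zip(outputs, ground_truth[1:]))


# Toy data: the true motion advances by 1.0 per frame,
# while the stand-in "model" overshoots by 0.1 on every step.
truth = [float(t) for t in range(10)]
biased_step = lambda x: x + 1.1

teacher_forced = rollout(biased_step, truth, n_autoregressive=0)
fully_autoregressive = rollout(biased_step, truth, n_autoregressive=len(truth))

print("teacher forcing max error:", max_error(teacher_forced, truth))
print("autoregressive max error: ", max_error(fully_autoregressive, truth))
for epoch in range(10):
    print(epoch, autoregressive_steps(epoch, total_epochs=10, seq_len=len(truth)))
```

Under teacher forcing the per-step bias never compounds, whereas in the fully autoregressive rollout it grows with sequence length. A curriculum of this general shape exposes the model to its own errors gradually rather than all at once.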
Taxonomy
Topics: Human Pose and Action Recognition · Human Motion and Animation · Music and Audio Processing
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Sigmoid Activation · Tanh Activation · Long Short-Term Memory · Residual Connection · Label Smoothing · Multi-Head Attention · Sequence to Sequence
