Music-to-Dance Generation with Optimal Transport

Shuang Wu; Shijian Lu; Li Cheng

arXiv:2112.01806 · cs.SD · May 5, 2022

Open Access

TL;DR

This paper introduces MDOT-Net, a novel neural network that generates realistic, diverse 3D dance choreographies from music by using optimal transport distances to improve training stability and music-dance correspondence.

Contribution

It proposes a new training framework employing optimal transport and Gromov-Wasserstein distances for music-to-dance generation, enhancing realism, diversity, and music alignment.
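To make the optimal transport idea concrete, here is a minimal sketch of an entropy-regularised optimal transport cost between two sets of feature vectors, computed with Sinkhorn iterations. It is an illustration only and not the paper's training code: the function name `sinkhorn_cost`, the squared Euclidean ground cost, the uniform sample weights, and the parameters `eps` and `n_iters` are all assumptions made for this example.

```python
import numpy as np

def sinkhorn_cost(x, y, eps=0.2, n_iters=500):
    """Entropy-regularised OT cost between point clouds x (n, d) and y (m, d).

    Hypothetical helper for illustration; uniform weights are assumed.
    `eps` is the regularisation strength relative to the largest ground cost.
    """
    n, m = len(x), len(y)
    a = np.full(n, 1.0 / n)  # uniform mass on the samples in x
    b = np.full(m, 1.0 / m)  # uniform mass on the samples in y

    # Squared Euclidean ground cost between every pair of samples.
    cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)

    # Gibbs kernel; scaling by cost.max() keeps the exponent well conditioned.
    kernel = np.exp(-cost / (eps * cost.max()))

    # Sinkhorn iterations: alternately rescale rows and columns of the kernel
    # so that the resulting plan matches the marginals a and b.
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (kernel @ v)
        v = b / (kernel.T @ u)

    plan = u[:, None] * kernel * v[None, :]
    return float((plan * cost).sum()), plan
```

The returned plan is a soft matching between the two sample sets, and the cost stays finite and informative even when the two sets do not overlap, which is the property that makes transport-based objectives attractive as a training signal.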

Findings

01. Generated dances are realistic and diverse.

02. Dance sequences match musical rhythm and style.

03. The method outperforms previous approaches in coherence and quality.

Abstract

Dance choreography for a piece of music is a challenging task that requires creativity in presenting distinctive stylistic dance elements while taking into account the musical theme and rhythm. It has been tackled by different approaches such as similarity retrieval, sequence-to-sequence modeling and generative adversarial networks, but their generated dance sequences are often short of motion realism, diversity and music consistency. In this paper, we propose a Music-to-Dance with Optimal Transport Network (MDOT-Net) for learning to generate 3D dance choreographies from music. We introduce an optimal transport distance for evaluating the authenticity of the generated dance distribution and a Gromov-Wasserstein distance to measure the correspondence between the dance distribution and the input music. This gives a well-defined and non-divergent training objective that mitigates the…
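The Gromov-Wasserstein distance mentioned in the abstract compares two sets that live in different feature spaces by matching their internal pairwise distances rather than the points themselves. The sketch below evaluates the squared-loss Gromov-Wasserstein objective for a fixed coupling between music and dance samples. It is a hedged illustration, not the paper's implementation: the names `pairwise_dist` and `gw_objective`, the Euclidean intra-domain distances, and the choice to evaluate a given coupling rather than optimise over couplings are assumptions made here.

```python
import numpy as np

def pairwise_dist(z):
    """Euclidean distance matrix within one domain; z has shape (n, d)."""
    diff = z[:, None, :] - z[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def gw_objective(d_music, d_dance, plan):
    """Squared-loss Gromov-Wasserstein objective for a fixed coupling.

    d_music: (n, n) distances among music samples (hypothetical input)
    d_dance: (m, m) distances among dance samples (hypothetical input)
    plan:    (n, m) coupling whose entries sum to 1

    Computes sum over i, j, k, l of
    (d_music[i, k] - d_dance[j, l])**2 * plan[i, j] * plan[k, l].
    """
    # Broadcast to a (n, m, n, m) tensor indexed by (i, j, k, l).
    diff = d_music[:, None, :, None] - d_dance[None, :, None, :]
    return float(np.einsum("ijkl,ij,kl->", diff ** 2, plan, plan))
```

The objective is zero when the coupling pairs samples so that distances within the music set equal distances within the dance set, and it grows as the coupling scrambles that structure. The four-index tensor costs O(n²m²) memory, so this direct form only suits small batches.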

Taxonomy

Topics: Human Motion and Animation · Human Pose and Action Recognition · Music Technology and Sound Studies