Music to Dance as Language Translation using Sequence Models

André Correia; Luís A. Alexandre

arXiv:2403.15569 · cs.SD · October 18, 2024 · 1 citation

TL;DR

This paper introduces MDLT, a sequence-to-sequence translation approach using Transformer and Mamba architectures to generate dance choreography from music, demonstrating high-quality results on robotic dance tasks.

Contribution

The paper frames choreography synthesis as a translation problem and presents two variants of MDLT for music-to-dance translation: one built on the Transformer architecture and one on the Mamba architecture.

Findings

01 MDLT outperforms baselines on realism and quality metrics

02 Both the Transformer and Mamba variants generate dance from music effectively

03 The method is trained to make a robotic arm dance and can be applied to a full humanoid robot

Abstract

Synthesising appropriate choreographies from music remains an open problem. We introduce MDLT, a novel approach that frames the choreography generation problem as a translation task. Our method leverages an existing data set to learn to translate sequences of audio into corresponding dance poses. We present two variants of MDLT: one utilising the Transformer architecture and the other employing the Mamba architecture. We train our method on AIST++ and PhantomDance data sets to teach a robotic arm to dance, but our method can be applied to a full humanoid robot. Evaluation metrics, including Average Joint Error and Fréchet Inception Distance, consistently demonstrate that, when given a piece of music, MDLT excels at producing realistic and high-quality choreography. The code can be found at github.com/meowatthemoon/MDLT.
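
The abstract names Average Joint Error as an evaluation metric but does not define it here. The sketch below shows one plausible reading, assuming the metric is the mean absolute difference between predicted and reference joint values, averaged over every joint in every frame. The function name `average_joint_error` and the list-of-frames layout are illustrative assumptions and are not taken from the MDLT repository.

```python
from typing import Sequence

# One frame of a choreography: one value per joint (for example, joint angles).
Pose = Sequence[float]


def average_joint_error(predicted: Sequence[Pose], reference: Sequence[Pose]) -> float:
    """Mean absolute per-joint difference over all frames.

    Assumed definition for illustration; the paper's exact formula may differ.
    """
    if len(predicted) != len(reference):
        raise ValueError("sequences must have the same number of frames")
    total = 0.0
    count = 0
    for pred_frame, ref_frame in zip(predicted, reference):
        if len(pred_frame) != len(ref_frame):
            raise ValueError("frames must have the same number of joints")
        for p, r in zip(pred_frame, ref_frame):
            total += abs(p - r)
            count += 1
    if count == 0:
        raise ValueError("sequences must contain at least one joint value")
    return total / count


# Toy example: two frames, two joints each (made-up values).
predicted = [[0.0, 1.0], [2.0, 3.0]]
reference = [[0.0, 2.0], [1.0, 3.0]]
aje = average_joint_error(predicted, reference)
print(aje)
```

Under this reading, lower values mean the generated poses stay closer to the reference choreography, and a perfect reproduction scores zero.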

Code & Models

Repositories

meowatthemoon/mdlt
PyTorch · Official

Taxonomy

Topics: Music and Audio Processing · Human Motion and Animation · Natural Language Processing Techniques