MAGMA: Music Aligned Generative Motion Autodecoder

Sohan Anisetty; Amit Raj; James Hays

arXiv:2309.01202·cs.GR·September 6, 2023

TL;DR

This paper introduces MAGMA, a two-step music-to-dance generation model that pairs a VQ-VAE with a Transformer decoder. It reports state-of-the-art results and supports real-time generation of long, customizable dance sequences.

Contribution

The paper presents a two-step approach to music-to-dance generation that combines a VQ-VAE with a Transformer, improving the length, coherence, and customizability of the generated sequences.
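The first of the two steps, turning continuous motion into discrete primitives, rests on a nearest-neighbour codebook lookup. The following is a minimal sketch of that lookup only, not the authors' implementation: the function names `quantize` and `dequantize`, the 2-D toy codebook, and the example frames are all hypothetical, and a real VQ-VAE would learn both the encoder features and the codebook.

```python
def quantize(frames, codebook):
    """Map each feature vector to the index of its nearest codebook entry."""
    indices = []
    for frame in frames:
        # Squared Euclidean distance from this frame to every codebook vector.
        distances = [
            sum((f - c) ** 2 for f, c in zip(frame, code))
            for code in codebook
        ]
        indices.append(distances.index(min(distances)))
    return indices


def dequantize(indices, codebook):
    """Replace each token index with its codebook vector."""
    return [list(codebook[i]) for i in indices]


# Hypothetical codebook of three 2-D "motion primitives" (illustrative values only).
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
# Hypothetical encoder outputs for four motion frames.
frames = [(0.1, -0.1), (0.9, 0.2), (0.2, 0.8), (0.95, 0.05)]

tokens = quantize(frames, codebook)
reconstruction = dequantize(tokens, codebook)
```

The resulting `tokens` list is the kind of discrete sequence a Transformer decoder can then be trained to predict, in the same way a language model predicts word tokens.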

Findings

01. Achieves state-of-the-art results on music-to-motion benchmarks.

02. Enables real-time generation of longer dance sequences.

03. Allows seamless chaining and style customization of generated dances.

Abstract

Mapping music to dance is a challenging problem that requires spatial and temporal coherence along with continual synchronization with the music's progression. Taking inspiration from large language models, we introduce a 2-step approach for generating dance: a Vector Quantized-Variational Autoencoder (VQ-VAE) distills motion into primitives, and a Transformer decoder is trained to learn the correct sequencing of these primitives. We also evaluate the importance of music representations by comparing naive music feature extraction using Librosa to deep audio representations generated by state-of-the-art audio compression algorithms. Additionally, we train variations of the motion generator using relative and absolute positional encodings to determine the effect on motion quality when generating arbitrarily long sequences. Our proposed approach achieves state-of-the-art…
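To make the distinction between absolute and relative positional encodings concrete, here is a small sketch of one common form of each. The abstract does not say which specific schemes the paper uses, so both choices are assumptions: the sinusoidal encoding from the original Transformer paper for the absolute case, and clipped pairwise offsets (as typically used to index a learned relative-position embedding table) for the relative case. The function names and parameters are hypothetical.

```python
import math


def sinusoidal_encoding(position, d_model):
    """Absolute encoding: a fixed vector that depends only on the position index."""
    encoding = []
    for i in range(0, d_model, 2):
        angle = position / (10000 ** (i / d_model))
        encoding.append(math.sin(angle))  # even dimension
        encoding.append(math.cos(angle))  # odd dimension
    return encoding[:d_model]


def relative_offsets(length, max_distance):
    """Relative encoding: signed offsets j - i between positions, clipped to a window."""
    return [
        [max(-max_distance, min(max_distance, j - i)) for j in range(length)]
        for i in range(length)
    ]


first_position = sinusoidal_encoding(0, 4)
offsets = relative_offsets(3, 1)
```

Because the clipped offsets depend only on the distance between two positions, the same small set of values covers a sequence of any length, which is one reason relative schemes are often considered when generating sequences longer than those seen in training.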

Taxonomy

Topics: Human Motion and Animation · Music and Audio Processing · Music Technology and Sound Studies

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Residual Connection · Adam · Byte Pair Encoding · Softmax · Dropout · Label Smoothing · Absolute Position Encodings