Transflower: probabilistic autoregressive dance generation with multimodal attention

Guillermo Valle-Pérez; Gustav Eje Henter; Jonas Beskow; André Holzapfel; Pierre-Yves Oudeyer; Simon Alexanderson

arXiv:2106.13871·cs.SD·June 14, 2022

TL;DR

This paper introduces Transflower, a probabilistic autoregressive model with multimodal attention for generating realistic dance movements conditioned on music, supported by a new large-scale 3D dance dataset.

Contribution

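The paper presents a novel probabilistic autoregressive architecture for dance generation: a normalizing flow models the distribution over future poses, conditioned on previous poses and music context through a multimodal transformer encoder. It also introduces what the authors describe as the currently largest 3D dance-motion dataset, captured with a variety of motion-capture technologies and covering both professional and casual dancers.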

Findings

1. The model produces diverse, realistic dance movements matching music.

2. Multimodal attention improves dance-music alignment.

3. Probabilistic modeling enhances diversity in generated dances.

Abstract

Dance requires skillful composition of complex movements that follow rhythmic, tonal and timbral features of music. Formally, generating dance conditioned on a piece of music can be expressed as a problem of modelling a high-dimensional continuous motion signal, conditioned on an audio signal. In this work we make two contributions to tackle this problem. First, we present a novel probabilistic autoregressive architecture that models the distribution over future poses with a normalizing flow conditioned on previous poses as well as music context, using a multimodal transformer encoder. Second, we introduce the currently largest 3D dance-motion dataset, obtained with a variety of motion-capture technologies, and including both professional and casual dancers. Using this dataset, we compare our new model against two baselines, via objective metrics and a user study, and show that both the…
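
The generation scheme described in the abstract (sample the next pose from a flow conditioned on previous poses and music, append it, repeat) can be illustrated with a toy sketch. This is not the authors' implementation: the flow here is a single conditional affine layer, and `context_params` is a hypothetical hand-written stand-in for the multimodal transformer encoder. The function names, the scalar music features, and the context length are all assumptions made for illustration.

```python
import math
import random


def context_params(prev_poses, music_feats, dim):
    """Hypothetical stand-in for the multimodal encoder (assumption).

    Maps previous poses and music features to a per-dimension shift and
    log-scale that parameterize the conditional flow.
    """
    music_mean = sum(music_feats) / len(music_feats)
    shift = [
        sum(p[d] for p in prev_poses) / len(prev_poses) + 0.1 * music_mean
        for d in range(dim)
    ]
    log_scale = [-1.0 + 0.05 * music_mean for _ in range(dim)]
    return shift, log_scale


def flow_forward(z, shift, log_scale):
    """Latent -> pose: x = shift + exp(log_scale) * z."""
    return [s + math.exp(ls) * zi for zi, s, ls in zip(z, shift, log_scale)]


def flow_inverse(x, shift, log_scale):
    """Pose -> latent: z = (x - shift) * exp(-log_scale)."""
    return [(xi - s) * math.exp(-ls) for xi, s, ls in zip(x, shift, log_scale)]


def log_prob(x, shift, log_scale):
    """Exact log-density via change of variables with a standard normal base."""
    z = flow_inverse(x, shift, log_scale)
    base = sum(-0.5 * zi * zi - 0.5 * math.log(2 * math.pi) for zi in z)
    # log|det dz/dx| = -sum(log_scale) for an affine transform
    return base - sum(log_scale)


def generate(seed_poses, music, context_len, rng):
    """Autoregressive rollout: one sampled pose per remaining music frame."""
    poses = [list(p) for p in seed_poses]
    dim = len(poses[0])
    for t in range(len(seed_poses), len(music)):
        prev = poses[-context_len:]
        feats = music[max(0, t - context_len): t + 1]
        shift, log_scale = context_params(prev, feats, dim)
        z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        poses.append(flow_forward(z, shift, log_scale))
    return poses


# Toy usage: 20 frames of scalar "music" and two 3-dimensional seed poses.
music = [math.sin(0.3 * t) for t in range(20)]
seed = [[0.0, 0.0, 0.0], [0.1, 0.0, -0.1]]
dance = generate(seed, music, context_len=4, rng=random.Random(0))
```

Because each step samples a fresh latent, rollouts with different random seeds give different motion for the same music, which is the sense in which a probabilistic model can produce diverse outputs. A real normalizing flow would stack many invertible layers rather than the single affine map used here.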
