Audio to Body Dynamics

Eli Shlizerman; Lucio M. Dery; Hayden Schoen; Ira Kemelmacher-Shlizerman

arXiv:1712.09382·eess.AS·May 11, 2020

TL;DR

This paper introduces a method that predicts skeleton movements from the audio of violin or piano performances and uses them to animate an avatar that mimics a musician's hand and arm motions.

Contribution

The paper presents what the authors describe as the first result showing that natural body dynamics can be predicted from music audio, using an LSTM network trained on recital videos uploaded to the Internet.

Findings

01. Successfully predicts arm and finger movements from audio.

02. Creates animated avatars that mimic musical performances.

03. First to show body dynamics can be inferred from music.

Abstract

We present a method that takes audio of violin or piano playing as input and outputs a video of skeleton predictions, which are then used to animate an avatar. The key idea is to create an animation of an avatar that moves its hands the way a pianist or violinist would, from audio alone. Fully detailed, correct arm and finger motion is the ultimate goal; however, it is not clear whether body movement can be predicted from music at all. In this paper, we present the first result showing that natural body dynamics can be predicted from music. We built an LSTM network trained on violin and piano recital videos uploaded to the Internet. The predicted points are applied to a rigged avatar to create the animation.
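To make the audio-to-skeleton mapping concrete, here is a minimal sketch of an LSTM that reads one audio feature vector per frame and emits one set of 2D keypoint coordinates per frame. It is not the authors' implementation: the class name TinyLSTM, the feature count (13), the hidden size (8), and the keypoint count (5) are all hypothetical, the weights are random and untrained, and the audio is random stand-in data. It is written in plain Python so it runs without dependencies, whereas the linked repository uses PyTorch.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class TinyLSTM:
    """Hypothetical single-layer LSTM with a linear readout (illustration only)."""

    GATES = ("input", "forget", "output", "candidate")

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        self.n_hidden = n_hidden
        # One weight matrix per gate, acting on the concatenation [x; h].
        self.W = {g: rand_matrix(n_hidden, n_in + n_hidden, rng) for g in self.GATES}
        self.b = {g: [0.0] * n_hidden for g in self.GATES}
        self.W_out = rand_matrix(n_out, n_hidden, rng)

    def step(self, x, h, c):
        z = x + h  # list concatenation of current input and previous hidden state
        pre = {
            g: [a + b for a, b in zip(matvec(self.W[g], z), self.b[g])]
            for g in self.GATES
        }
        i = [sigmoid(v) for v in pre["input"]]
        f = [sigmoid(v) for v in pre["forget"]]
        o = [sigmoid(v) for v in pre["output"]]
        cand = [math.tanh(v) for v in pre["candidate"]]
        # Cell state: keep part of the old memory, add part of the candidate.
        c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, cand)]
        h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
        return h_new, c_new

    def predict(self, frames):
        # Run over the whole audio sequence, emitting one keypoint vector per frame.
        h = [0.0] * self.n_hidden
        c = [0.0] * self.n_hidden
        outputs = []
        for x in frames:
            h, c = self.step(x, h, c)
            outputs.append(matvec(self.W_out, h))
        return outputs


N_FEATURES = 13   # assumed number of audio features per frame
N_KEYPOINTS = 5   # assumed number of skeleton points, each with (x, y)
N_FRAMES = 20

model = TinyLSTM(N_FEATURES, 8, 2 * N_KEYPOINTS)
data_rng = random.Random(1)
audio = [[data_rng.uniform(-1, 1) for _ in range(N_FEATURES)] for _ in range(N_FRAMES)]
skeleton = model.predict(audio)
print(len(skeleton), len(skeleton[0]))
```

Because the hidden state is carried from frame to frame, each predicted pose depends on the audio heard so far rather than on a single frame. In a trained system the weights would be fitted so that the outputs match keypoints extracted from recital videos.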

Code & Models

Repositories

facebookresearch/Audio2BodyDynamics
pytorch

Taxonomy

Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory