Action-Conditioned 3D Human Motion Synthesis with Transformer VAE

Mathis Petrovich; Michael J. Black; Gül Varol

arXiv:2104.05670·cs.CV·September 21, 2021·1 cites

TL;DR

This paper introduces ACTOR, a Transformer-based VAE model for generating diverse, action-conditioned 3D human motion sequences without initial poses, improving over previous methods and enabling applications like data augmentation and motion denoising.

Contribution

The paper proposes a novel Transformer-based VAE architecture for action-conditioned 3D human motion synthesis, capable of generating variable-length sequences without initial poses.

Findings

1. Outperforms the state of the art on the NTU RGB+D, HumanAct12, and UESTC datasets.

2. Enables data augmentation that improves action recognition.

3. Provides effective motion denoising.

Abstract

We tackle the problem of action-conditioned generation of realistic and diverse human motion sequences. In contrast to methods that complete, or extend, motion sequences, this task does not require an initial pose or sequence. Here we learn an action-aware latent representation for human motions by training a generative variational autoencoder (VAE). By sampling from this latent space and querying a certain duration through a series of positional encodings, we synthesize variable-length motion sequences conditioned on a categorical action. Specifically, we design a Transformer-based architecture, ACTOR, for encoding and decoding a sequence of parametric SMPL human body models estimated from action recognition datasets. We evaluate our approach on the NTU RGB+D, HumanAct12 and UESTC datasets and show improvements over the state of the art. Furthermore, we present two use cases: improving…
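The two generation-time steps the abstract names, sampling from the latent space and querying a duration through positional encodings, can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the sinusoidal form of the encodings, the function names, and the dimensions are assumptions chosen for illustration, and the Transformer encoder and decoder themselves are omitted.

```python
import math
import random


def sinusoidal_encoding(num_frames, dim):
    """Return one sinusoidal positional encoding per frame (assumed form).

    These rows would act as the time queries given to a decoder; choosing
    num_frames is what sets the length of the generated sequence.
    """
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    rows = []
    for t in range(num_frames):
        row = []
        for i in range(0, dim, 2):
            freq = 1.0 / (10000 ** (i / dim))
            row.append(math.sin(t * freq))
            row.append(math.cos(t * freq))
        rows.append(row)
    return rows


def sample_latent(mu, logvar, rng):
    """Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, logvar)]


def kl_to_standard_normal(mu, logvar):
    """KL divergence from N(mu, diag(exp(logvar))) to N(0, I)."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, logvar))


if __name__ == "__main__":
    rng = random.Random(0)
    latent_dim = 8  # hypothetical size, not taken from the paper
    # At generation time a VAE samples from the prior N(0, I).
    z = sample_latent([0.0] * latent_dim, [0.0] * latent_dim, rng)
    # The same latent vector can be paired with different durations.
    for duration in (40, 60, 120):
        queries = sinusoidal_encoding(duration, latent_dim)
        print(duration, len(queries), len(queries[0]), len(z))
```

In this sketch the requested duration only changes how many query rows are built; the latent vector is unchanged, which is one way to read the abstract's claim of variable-length synthesis from a single sampled latent.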

Taxonomy

Topics: Human Pose and Action Recognition · Human Motion and Animation · Generative Adversarial Networks and Image Synthesis