Recurrent Transformer Variational Autoencoders for Multi-Action Motion Synthesis

Rania Briq; Chuhang Zou; Leonid Pishchulin; Chris Broaddus; Juergen Gall

arXiv:2206.06741 · cs.CV · June 28, 2022 · 1 citation

Open Access · 1 Repo

TL;DR

This paper introduces a Recurrent Transformer Variational Autoencoder for synthesizing multi-action human motion sequences of arbitrary length. It produces smooth, realistic motion and improves FID and semantic consistency scores on the PROX and Charades datasets.

Contribution

It presents an efficient, iterative approach combining Recurrent Transformers and conditional VAEs for multi-action motion synthesis, addressing limitations of prior single-action methods.

Findings

01 Significant improvements in FID score over state-of-the-art methods

02 Enhanced semantic consistency in generated sequences

03 Able to generate arbitrary-length multi-action sequences
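For readers unfamiliar with the metric named in the first finding, the standard definition of the Fréchet distance between two Gaussians fitted to feature sets is given below. The symbols are assumptions introduced here for illustration: $\mu_r, \Sigma_r$ denote the mean and covariance of features from real sequences, and $\mu_g, \Sigma_g$ those of generated sequences. The feature extractor used for motion is not specified on this page.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

Lower values indicate that the generated feature distribution is closer to the real one.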

Abstract

We consider the problem of synthesizing multi-action human motion sequences of arbitrary lengths. Existing approaches have mastered motion sequence generation in single action scenarios, but fail to generalize to multi-action and arbitrary-length sequences. We fill this gap by proposing a novel efficient approach that leverages expressiveness of Recurrent Transformers and generative richness of conditional Variational Autoencoders. The proposed iterative approach is able to generate smooth and realistic human motion sequences with an arbitrary number of actions and frames while doing so in linear space and time. We train and evaluate the proposed approach on PROX and Charades datasets, where we augment PROX with ground-truth action labels and Charades with human mesh annotations. Experimental evaluation shows significant improvements in FID score and semantic consistency metrics…
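The iterative generation idea in the abstract can be illustrated with a toy sketch. This is not the authors' model: `toy_decode` is a hypothetical stand-in for a learned conditional decoder, and the pose dimension, segment length, and action encoding are all assumptions made for illustration. The sketch only shows the control flow: each segment is decoded from a freshly sampled latent, an action label, and a fixed-size state carried over from the previous segment, so the cost grows linearly with the number of frames.

```python
import random
from typing import List, Sequence, Tuple

Pose = List[float]  # hypothetical flat pose vector, not the paper's representation


def toy_decode(z: Sequence[float], action_id: int, state: Pose,
               seg_len: int) -> Tuple[List[Pose], Pose]:
    """Stand-in for a learned conditional decoder (NOT the paper's network)."""
    frames: List[Pose] = []
    pose = list(state)  # copy so the caller's state is not mutated
    for _ in range(seg_len):
        # Drift toward an action-dependent target perturbed by the latent.
        pose = [0.9 * p + 0.1 * (action_id + zi) for p, zi in zip(pose, z)]
        frames.append(pose)
    return frames, pose  # the last pose is the state passed to the next segment


def generate(actions: Sequence[Tuple[int, int]], pose_dim: int = 4,
             seg_len: int = 8, seed: int = 0) -> List[Pose]:
    """Generate motion for a list of (action_id, n_segments) pairs."""
    rng = random.Random(seed)
    state: Pose = [0.0] * pose_dim
    motion: List[Pose] = []
    for action_id, n_segments in actions:
        for _ in range(n_segments):
            # Sample a latent from the standard normal prior, as in a VAE.
            z = [rng.gauss(0.0, 1.0) for _ in range(pose_dim)]
            frames, state = toy_decode(z, action_id, state, seg_len)
            motion.extend(frames)
    return motion


# Two segments of action 0 followed by one segment of action 3.
motion = generate([(0, 2), (3, 1)])
```

Because only the fixed-size `state` is carried between segments, appending another action or more segments adds a constant amount of work per frame, which is the sense in which such a scheme is linear in sequence length.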

Code & Models

Repositories

briqr/recurrent_transformer_vae_multi_action (PyTorch, official)

Taxonomy

Topics: Human Motion and Animation · Human Pose and Action Recognition · Advanced Vision and Imaging