LARNet: Latent Action Representation for Human Action Synthesis

Naman Biyani; Aayush J Rana; Shruti Vyas; Yogesh S Rawat

arXiv:2110.10899·cs.CV·October 28, 2021

TL;DR

LARNet is an end-to-end model that synthesizes human action videos by learning action dynamics in latent space, eliminating the need for driving videos, and employs a hierarchical recurrent structure with a novel loss for improved temporal coherence.

Contribution

The paper introduces a generative model of action dynamics in latent space, together with a hierarchical recurrent structure and a mix-adversarial loss for video synthesis.

Findings

01 Effective in generating realistic human action videos

02 Outperforms existing methods on multiple datasets

03 Improves temporal coherence in synthesized videos

Abstract

We present LARNet, a novel end-to-end approach for generating human action videos. A joint generative modeling of appearance and dynamics to synthesize a video is very challenging and therefore recent works in video synthesis have proposed to decompose these two factors. However, these methods require a driving video to model the video dynamics. In this work, we propose a generative approach instead, which explicitly learns action dynamics in latent space avoiding the need of a driving video during inference. The generated action dynamics is integrated with the appearance using a recurrent hierarchical structure which induces motion at different scales to focus on both coarse as well as fine level action details. In addition, we propose a novel mix-adversarial loss function which aims at improving the temporal coherency of synthesized videos. We evaluate the proposed approach on four…
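The abstract's central idea, producing action dynamics from a latent model rather than extracting them from a driving video, and then injecting them into appearance features at several scales, can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the function names, the plain tanh recurrence, the random untrained weights, and the additive injection are hypothetical and are not taken from the paper or its repository.

```python
import numpy as np


def rollout_latent_motion(action_id, num_actions, appearance, num_frames,
                          hidden=32, seed=0):
    """Roll out one latent motion vector per frame from an action label.

    Illustrative only: weights are random (untrained), and the plain tanh
    recurrence is an assumption, not the paper's architecture.
    """
    rng = np.random.default_rng(seed)
    one_hot = np.zeros(num_actions)
    one_hot[action_id] = 1.0
    # Condition on the action class, an appearance code, and a noise vector,
    # so no driving video is needed to obtain the dynamics.
    noise = rng.standard_normal(hidden)
    cond = np.concatenate([one_hot, appearance, noise])
    w_in = rng.standard_normal((hidden, cond.size)) * 0.1
    w_rec = rng.standard_normal((hidden, hidden)) * 0.1
    h = np.zeros(hidden)
    states = []
    for _ in range(num_frames):
        h = np.tanh(w_in @ cond + w_rec @ h)
        states.append(h)
    return np.stack(states)  # shape: (num_frames, hidden)


def inject_motion(appearance_maps, motion):
    """Add the per-frame motion code to appearance feature maps at each scale.

    Each map has shape (channels, height, width) with channels <= hidden.
    Additive injection is a hypothetical stand-in for the paper's integration.
    """
    out = []
    for fmap in appearance_maps:
        channels = fmap.shape[0]
        # (1, C, H, W) + (T, C, 1, 1) broadcasts to (T, C, H, W).
        out.append(fmap[None] + motion[:, :channels, None, None])
    return out
```

Calling `rollout_latent_motion(2, 5, np.zeros(8), num_frames=16)` yields a `(16, 32)` array of motion codes, and passing it to `inject_motion` with a coarse and a fine feature map returns one frame-indexed tensor per scale. A trained model would learn these weights and decode the result into frames; the sketch only shows how the data could flow.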

Code & Models

Repositories

aayushjr/larnet (PyTorch, official)

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Human Pose and Action Recognition · Human Motion and Animation