Synthetic Human Action Video Data Generation with Pose Transfer

Vaclav Knapp; Matyas Bohacek

arXiv:2506.09411·cs.CV·June 12, 2025

Open Access

TL;DR

This paper introduces a pose transfer-based method for generating realistic synthetic human action videos, enhancing training data diversity and improving action recognition performance.

Contribution

The paper presents a pose transfer technique based on controllable 3D Gaussian avatar models and releases RANDOM People, a new dataset with diverse human identities, for research.

Findings

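01. Improves action recognition accuracy on the Toyota Smarthome and NTU RGB+D benchmark datasets.

02. Scales few-shot datasets effectively, adding diverse backgrounds and covering groups underrepresented in the real training data.

03. Increases the diversity of the data available for training action recognition models.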

Abstract

In video understanding tasks, particularly those involving human motion, synthetic data generation often suffers from uncanny features, diminishing its effectiveness for training. Tasks such as sign language translation, gesture recognition, and human motion understanding in autonomous driving have thus been unable to exploit the full potential of synthetic data. This paper proposes a method for generating synthetic human action video data using pose transfer (specifically, controllable 3D Gaussian avatar models). We evaluate this method on the Toyota Smarthome and NTU RGB+D datasets and show that it improves performance in action recognition tasks. Moreover, we demonstrate that the method can effectively scale few-shot datasets, making up for groups underrepresented in the real training data and adding diverse backgrounds. We open-source the method along with RANDOM People, a dataset…

Taxonomy

Topics: Human Pose and Action Recognition · Human Motion and Animation · Social Robot Interaction and HRI