LS-GAN: Human Motion Synthesis with Latent-space GANs

Avinash Amballa; Gayathri Akkinapalli; Vinitra Muralikrishnan

arXiv:2501.01449 · cs.CV · January 6, 2025

Open Access

TL;DR

This paper introduces a latent-space GAN framework for human motion synthesis conditioned on text, achieving faster training and inference with high-quality results comparable to diffusion models.

Contribution

The paper presents a novel latent-space GAN approach for text-conditioned human motion synthesis, reducing computational costs while maintaining high-quality outputs.

Findings

1. Achieved a FID of 0.482 on benchmarks.

2. Reduced FLOPs by over 91% compared to diffusion models.

3. Demonstrated results competitive with state-of-the-art methods.
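
For readers unfamiliar with the FID metric cited in the findings, the sketch below shows the standard Fréchet distance between two Gaussians fitted to feature sets. It is a minimal illustration, not the paper's evaluation code: the function name `frechet_distance` is hypothetical, and the features are assumed to come from some pretrained motion feature extractor that this summary does not describe.

```python
import numpy as np

def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two feature sets.

    feats_a, feats_b: arrays of shape (num_samples, feature_dim).
    Assumption: features were already extracted by some encoder.
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # Tr((cov_a cov_b)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b, which are real and non-negative for
    # positive semi-definite inputs (clipping guards against round-off).
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * tr_sqrt)

# Toy usage with random stand-in features (not real motion data).
rng = np.random.default_rng(0)
real_feats = rng.normal(size=(200, 4))
shifted_feats = real_feats + 1.0  # same covariance, mean shifted by 1 per dimension
same_score = frechet_distance(real_feats, real_feats)
shifted_score = frechet_distance(real_feats, shifted_feats)
```

Lower values mean the two feature distributions are closer. In the toy usage, identical sets score approximately zero, and a pure mean shift contributes only the squared distance between the means.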

Abstract

Human motion synthesis conditioned on textual input has gained significant attention in recent years due to its potential applications in various domains such as gaming, film production, and virtual reality. Conditioned motion synthesis takes a text input and outputs a 3D motion corresponding to the text. While previous works have explored motion synthesis using raw motion data and latent space representations with diffusion models, these approaches often suffer from high training and inference times. In this paper, we introduce a novel framework that utilizes Generative Adversarial Networks (GANs) in the latent space to enable faster training and inference while achieving results comparable to those of the state-of-the-art diffusion methods. We perform experiments on the HumanML3D and HumanAct12 benchmarks and demonstrate that a remarkably simple GAN in the latent space achieves a FID of…
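
The general idea of a text-conditioned GAN operating on motion latents can be sketched as follows. This is a toy illustration under several assumptions and not the authors' implementation: all dimensions, the two-layer MLPs, the helper names, and the non-saturating binary cross-entropy loss are hypothetical choices, and the text encoder and motion autoencoder are replaced by random placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

# All sizes are hypothetical, chosen only for illustration.
TEXT_DIM, NOISE_DIM, LATENT_DIM, HIDDEN = 16, 8, 12, 32

def init_mlp(in_dim, hidden, out_dim):
    """Random weights for a two-layer MLP (stand-in for a trained network)."""
    return {
        "w1": rng.normal(0, 0.1, (in_dim, hidden)), "b1": np.zeros(hidden),
        "w2": rng.normal(0, 0.1, (hidden, out_dim)), "b2": np.zeros(out_dim),
    }

def mlp(params, x):
    h = np.maximum(x @ params["w1"] + params["b1"], 0.0)  # ReLU hidden layer
    return h @ params["w2"] + params["b2"]

gen = init_mlp(NOISE_DIM + TEXT_DIM, HIDDEN, LATENT_DIM)
disc = init_mlp(LATENT_DIM + TEXT_DIM, HIDDEN, 1)

def generate(noise, text_emb):
    """Map noise plus a text embedding to a motion latent."""
    return mlp(gen, np.concatenate([noise, text_emb], axis=1))

def discriminate(latent, text_emb):
    """Return a real/fake logit for each (latent, text) pair."""
    return mlp(disc, np.concatenate([latent, text_emb], axis=1))[:, 0]

def bce_with_logits(logits, target):
    """Numerically stable binary cross-entropy on logits."""
    return np.mean(np.maximum(logits, 0) - logits * target
                   + np.log1p(np.exp(-np.abs(logits))))

batch = 4
text_emb = rng.normal(size=(batch, TEXT_DIM))        # placeholder text encoder output
real_latent = rng.normal(size=(batch, LATENT_DIM))   # placeholder encoded real motion
fake_latent = generate(rng.normal(size=(batch, NOISE_DIM)), text_emb)

# Discriminator: label real pairs 1 and generated pairs 0.
d_loss = (bce_with_logits(discriminate(real_latent, text_emb), 1.0)
          + bce_with_logits(discriminate(fake_latent, text_emb), 0.0))
# Generator (non-saturating form): push generated pairs toward label 1.
g_loss = bce_with_logits(discriminate(fake_latent, text_emb), 1.0)
```

In a full pipeline, a motion decoder would map each generated latent back to a 3D motion sequence. Working on compact latents rather than raw motion frames, and sampling in a single generator pass, is the sort of design that would account for the training and inference savings the abstract describes.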

Taxonomy

Topics: Human Pose and Action Recognition · Human Motion and Animation · 3D Shape Modeling and Analysis

Methods: Softmax · Attention Is All You Need · Diffusion