AnimGAN: A Spatiotemporally-Conditioned Generative Adversarial Network for Character Animation
Maryam Sadat Mirzaei, Kourosh Meshgi, Etienne Frigo, Toyoaki Nishida

TL;DR
This paper introduces AnimGAN, a spatiotemporally-conditioned GAN that generates realistic, semantically relevant character animations with smooth dynamics. Generation is controlled by user-defined behaviors, and the model is trained on a large dataset of gestures, expressions, and actions.
Contribution
The paper presents a new GAN architecture that pairs an LSTM-based generator with a graph ConvNet discriminator to generate controllable, realistic character animation sequences.
Findings
Produces plausible, realistic animations
Maintains semantic relevance and spatiotemporal smoothness
Outperforms traditional conditional GANs in quality
Abstract
Producing realistic character animations is an essential task in human-AI interaction. Viewed as a sequence of humanoid poses, the task can be treated as a sequence generation problem with spatiotemporal smoothness and realism constraints. Additionally, we wish to control the behavior of AI agents by telling them what to do and, more specifically, how to do it. We propose a spatiotemporally-conditioned GAN that generates a sequence similar to a given sequence in terms of semantics and spatiotemporal dynamics. Using an LSTM-based generator and a graph ConvNet discriminator, the system is trained end-to-end on a large collected dataset of gestures, expressions, and actions. Experiments show that, compared to a traditional conditional GAN, our method creates plausible, realistic, and semantically relevant humanoid animation sequences that match user expectations.
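To make the abstract's framing concrete, here is a minimal sketch (not the authors' implementation) of how an animation might be stored as a sequence of poses, how temporal smoothness could be scored, and what a single neighborhood-aggregation step over a skeleton graph looks like. The type names, the `temporal_roughness` metric (mean squared joint acceleration), the `neighbor_average` helper, and the three-joint toy skeleton are all assumptions made for illustration; the source does not specify its smoothness measure or graph layer.

```python
from typing import Dict, List, Set, Tuple

Joint = Tuple[float, float, float]  # (x, y, z) position of one joint
Pose = List[Joint]                  # one frame: every joint of the humanoid
Sequence = List[Pose]               # an animation: poses ordered in time


def temporal_roughness(seq: Sequence) -> float:
    """Mean squared second finite difference (acceleration) over all joints.

    Lower values indicate smoother motion. Hypothetical metric chosen for
    illustration; it is not claimed to be the paper's constraint.
    """
    if len(seq) < 3:
        return 0.0  # acceleration needs at least three frames
    total, count = 0.0, 0
    for prev, cur, nxt in zip(seq, seq[1:], seq[2:]):
        for p, c, n in zip(prev, cur, nxt):
            for a, b, d in zip(p, c, n):
                acc = a - 2.0 * b + d
                total += acc * acc
                count += 1
    return total / count


def neighbor_average(pose: Pose, edges: List[Tuple[int, int]]) -> Pose:
    """Average each joint with its skeleton neighbors (self-loop included).

    This is only the aggregation step at the core of a graph convolution;
    the learned weights and nonlinearity of a real layer are omitted.
    """
    neighbors: Dict[int, Set[int]] = {i: {i} for i in range(len(pose))}
    for i, j in edges:
        neighbors[i].add(j)
        neighbors[j].add(i)
    out: Pose = []
    for i in range(len(pose)):
        ids = sorted(neighbors[i])
        out.append((
            sum(pose[k][0] for k in ids) / len(ids),
            sum(pose[k][1] for k in ids) / len(ids),
            sum(pose[k][2] for k in ids) / len(ids),
        ))
    return out


if __name__ == "__main__":
    # Toy data: one joint moving at constant velocity versus one that jitters.
    smooth = [[(0.5 * t, 0.0, 0.0)] for t in range(4)]
    jitter = [[(0.0, 0.0, 0.0)], [(1.0, 0.0, 0.0)], [(0.0, 0.0, 0.0)]]
    print(temporal_roughness(smooth), temporal_roughness(jitter))

    # Hypothetical three-joint chain: joint 0 - joint 1 - joint 2.
    chain = [(0, 1), (1, 2)]
    print(neighbor_average([(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0)], chain))
```

In a setup like the one the abstract describes, a score of this kind could serve as an evaluation check on generated sequences, while stacked aggregation steps with learned weights would let a discriminator reason over the skeleton's joint connectivity.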
Taxonomy
Topics: Human Motion and Animation · Human Pose and Action Recognition · Generative Adversarial Networks and Image Synthesis
