Pose-Guided Human Animation from a Single Image in the Wild
Jae Shin Yoon, Lingjie Liu, Vladislav Golyanik, Kripasindhu Sarkar, Hyun Soo Park, Christian Theobalt

TL;DR
This paper introduces a method for pose-guided human animation from a single image that preserves the person's identity and appearance over time, avoiding the visual artifacts and temporal inconsistencies of previous approaches.
Contribution
A compositional neural network predicts silhouette, labels, and textures separately, enabling robust, scene-independent human animation synthesis from a single image.
Findings
Outperforms state-of-the-art methods in synthesis quality
Achieves better temporal coherence
Generalizes well without fine-tuning
Abstract
We present a new pose transfer method for synthesizing a human animation from a single image of a person controlled by a sequence of body poses. Existing pose transfer methods exhibit significant visual artifacts when applied to a novel scene, resulting in temporal inconsistency and failures in preserving the identity and textures of the person. To address these limitations, we design a compositional neural network that predicts the silhouette, garment labels, and textures. Each modular network is explicitly dedicated to a subtask that can be learned from synthetic data. At inference time, we utilize the trained network to produce a unified representation of appearance and its labels in UV coordinates, which remains constant across poses. The unified representation provides incomplete yet strong guidance for generating the appearance in response to the pose change. We use the…
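The idea of a pose-independent appearance stored in UV coordinates can be illustrated with a minimal sketch. It is not the paper's implementation: the function names `scatter_to_uv` and `gather_from_uv`, the nearest-texel lookup, and the constant fill color are all assumptions made for illustration, and the per-pixel UV maps are assumed to be supplied by some external body-surface estimator. The sketch only shows why the guidance is "incomplete": surface regions never seen in the source image stay empty in the texture, and in the paper a learned network, not a fill color, is responsible for completing them.

```python
from typing import List, Optional, Tuple

Color = Tuple[int, int, int]
UV = Optional[Tuple[float, float]]  # None marks a background pixel


def scatter_to_uv(image: List[List[Color]],
                  uv_map: List[List[UV]],
                  tex_size: int) -> List[List[Optional[Color]]]:
    """Copy visible body pixels into a tex_size x tex_size UV texture.

    Texels that no source pixel maps to remain None (unseen surface).
    """
    texture: List[List[Optional[Color]]] = [
        [None] * tex_size for _ in range(tex_size)
    ]
    for pixel_row, uv_row in zip(image, uv_map):
        for color, uv in zip(pixel_row, uv_row):
            if uv is None:
                continue  # background: nothing to store
            u, v = uv
            tx = min(int(u * tex_size), tex_size - 1)
            ty = min(int(v * tex_size), tex_size - 1)
            texture[ty][tx] = color
    return texture


def gather_from_uv(texture: List[List[Optional[Color]]],
                   target_uv_map: List[List[UV]],
                   fill: Color = (0, 0, 0)) -> List[List[Color]]:
    """Render a new pose by looking up each target pixel's UV in the texture.

    Background pixels and unseen texels both receive the fill color here.
    """
    tex_size = len(texture)
    output: List[List[Color]] = []
    for uv_row in target_uv_map:
        out_row: List[Color] = []
        for uv in uv_row:
            if uv is None:
                out_row.append(fill)
                continue
            u, v = uv
            tx = min(int(u * tex_size), tex_size - 1)
            ty = min(int(v * tex_size), tex_size - 1)
            texel = texture[ty][tx]
            out_row.append(texel if texel is not None else fill)
        output.append(out_row)
    return output
```

Because the texture is indexed by body-surface coordinates rather than image coordinates, the same texture can be reused for every pose in the driving sequence; only the target UV map changes from frame to frame.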
