Towards Better Adversarial Synthesis of Human Images from Text
Rania Briq, Pratika Kochar, Juergen Gall

TL;DR
This paper introduces a method for generating multiple 3D human meshes from text descriptions, improving the realism and interaction modeling of human figures in synthesized images.
Contribution
The paper presents an approach that synthesizes SMPL-based 3D human shapes from text and uses these shapes as input to image synthesis frameworks, constraining the generated humans to realistic shapes.
Findings
Effective generation of diverse 3D human meshes from text
Improved realism in human shape synthesis within images
Captures scene dynamics and interactions from textual descriptions
Abstract
This paper proposes an approach that generates multiple 3D human meshes from text. The human shapes are represented by 3D meshes based on the SMPL model. The model's performance is evaluated on the COCO dataset, which contains challenging human shapes and intricate interactions between individuals. The model is able to capture the dynamics of the scene and the interactions between individuals based on text. We further show how using such a shape as input to image synthesis frameworks helps to constrain the network to synthesize humans with realistic human shapes.
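To make the "multiple SMPL-based meshes per scene" idea concrete, here is a minimal sketch of one way such an output could be laid out. The abstract does not say how the authors parameterize each person or how they handle a variable number of people, so the container `PersonParams`, the `translation` field, the helper `pack_scene`, and the padding-plus-mask scheme are all hypothetical. Only the SMPL dimensions are standard: 24 joints with 3 axis-angle values each, and 10 shape coefficients.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Standard SMPL dimensions: 24 joints x 3 axis-angle values, 10 shape coefficients.
NUM_POSE = 24 * 3
NUM_SHAPE = 10
NUM_TRANS = 3  # assumption: a 3D root translation places each person in the scene


@dataclass
class PersonParams:
    """SMPL-style parameters for one person (hypothetical container)."""
    pose: List[float] = field(default_factory=lambda: [0.0] * NUM_POSE)
    shape: List[float] = field(default_factory=lambda: [0.0] * NUM_SHAPE)
    translation: List[float] = field(default_factory=lambda: [0.0] * NUM_TRANS)

    def __post_init__(self) -> None:
        # Reject vectors whose length does not match the expected dimensions.
        if len(self.pose) != NUM_POSE:
            raise ValueError(f"pose must have {NUM_POSE} values")
        if len(self.shape) != NUM_SHAPE:
            raise ValueError(f"shape must have {NUM_SHAPE} values")
        if len(self.translation) != NUM_TRANS:
            raise ValueError(f"translation must have {NUM_TRANS} values")

    def flatten(self) -> List[float]:
        """Concatenate pose, shape, and translation into one flat vector."""
        return list(self.pose) + list(self.shape) + list(self.translation)


def pack_scene(
    people: List[PersonParams], max_people: int
) -> Tuple[List[List[float]], List[int]]:
    """Pad a variable-length list of people to a fixed number of rows.

    Returns the rows and a 0/1 mask marking which rows hold a real person.
    """
    if len(people) > max_people:
        raise ValueError("more people than max_people")
    row_len = NUM_POSE + NUM_SHAPE + NUM_TRANS
    rows = [p.flatten() for p in people]
    mask = [1] * len(people)
    while len(rows) < max_people:
        rows.append([0.0] * row_len)
        mask.append(0)
    return rows, mask


if __name__ == "__main__":
    rows, mask = pack_scene([PersonParams(), PersonParams()], max_people=4)
    print(len(rows), len(rows[0]), mask)
```

A fixed-size layout with a presence mask is one common way to let a generator emit a variable number of people per caption. The paper may use a different scheme.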
Taxonomy
Topics: Human Pose and Action Recognition · Generative Adversarial Networks and Image Synthesis · 3D Shape Modeling and Analysis
