Object-driven Text-to-Image Synthesis via Adversarial Training
Wenbo Li, Pengchuan Zhang, Lei Zhang, Qiuyuan Huang, Xiaodong He, Siwei Lyu, Jianfeng Gao

TL;DR
This paper introduces Obj-GAN, an object-driven attentive generative adversarial network for text-to-image synthesis of complex scenes. It improves object relevance and scene quality through attentive generation and object-wise discrimination.
Contribution
The paper presents a new object-driven attentive generator and an object-wise discriminator, enhancing text-to-image synthesis with better object relevance and scene coherence.
Findings
Outperforms the previous state of the art on the COCO benchmark
Increases Inception score by 27%
Reduces FID score by 11%
Abstract
In this paper, we propose Object-driven Attentive Generative Adversarial Networks (Obj-GANs) that allow object-centered text-to-image synthesis for complex scenes. Following the two-step (layout-image) generation process, a novel object-driven attentive image generator is proposed to synthesize salient objects by paying attention to the most relevant words in the text description and the pre-generated semantic layout. In addition, a new Fast R-CNN based object-wise discriminator is proposed to provide rich object-wise discrimination signals on whether the synthesized object matches the text description and the pre-generated layout. The proposed Obj-GAN significantly outperforms the previous state of the art in various metrics on the large-scale COCO benchmark, increasing the Inception score by 27% and decreasing the FID score by 11%. A thorough comparison between the traditional grid…
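The idea of "paying attention to the most relevant words" can be illustrated with a minimal sketch of dot-product attention over word vectors. This is not the authors' implementation: the function names (`softmax`, `attend`), the toy two-dimensional embeddings, and the assumption that the query vector comes from an object label in the pre-generated layout are all hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, word_vectors):
    """Dot-product attention of one query vector over a list of word vectors.

    Returns (weights, context): one weight per word, and the weighted
    average of the word vectors.
    """
    scores = [sum(q * w for q, w in zip(query, vec)) for vec in word_vectors]
    weights = softmax(scores)
    dim = len(query)
    context = [
        sum(wt * vec[i] for wt, vec in zip(weights, word_vectors))
        for i in range(dim)
    ]
    return weights, context


# Hypothetical toy data: four words with made-up 2-D embeddings.
words = ["a", "dog", "on", "grass"]
word_vectors = [[0.1, 0.0], [1.0, 0.2], [0.0, 0.1], [0.2, 1.0]]

# Hypothetical query: an embedding for a "dog" bounding box in the layout.
object_query = [2.0, 0.0]

weights, context = attend(object_query, word_vectors)
best_word = words[weights.index(max(weights))]
```

With these toy values the query aligns most strongly with the vector for "dog", so that word receives the largest weight and dominates the context vector. A real generator would use learned embeddings and apply this per image region or per object.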
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Computer Graphics and Visualization Techniques
Methods: Softmax · Convolution · RoIPool · Fast R-CNN
