CRD-CGAN: Category-Consistent and Relativistic Constraints for Diverse Text-to-Image Generation

Tao Hu; Chengjiang Long; Chunxia Xiao

arXiv:2107.13516·cs.CV·July 29, 2021

TL;DR

CRD-CGAN introduces category-consistent and relativistic constraints to enhance diversity and realism in text-to-image generation, outperforming existing methods on multiple datasets.

Contribution

The paper proposes a novel CRD-CGAN model with attention, diversity, relativistic, and category-consistent losses for improved photo-realistic and diverse image synthesis from text.

Findings

1. Outperforms state-of-the-art methods in photorealism and diversity.

2. Effective on multiple datasets, including Birds-200-2011, Oxford-102, and MSCOCO.

3. Improves the GAN's sensitivity to word attention and noise during image generation.

Abstract

Generating photo-realistic images from a text description is a challenging problem in computer vision. Previous works have shown promising performance in generating synthetic images conditioned on text with Generative Adversarial Networks (GANs). In this paper, we focus on category-consistent and relativistic diverse constraints to optimize the diversity of synthetic images. Based on these constraints, a category-consistent and relativistic diverse conditional GAN (CRD-CGAN) is proposed to synthesize $K$ photo-realistic images simultaneously. We use an attention loss and a diversity loss to improve the sensitivity of the GAN to word attention and noise. Then, we employ a relativistic conditional loss to estimate the probability that synthetic images are relatively real or fake, which improves on the basic conditional loss. Finally, we introduce a category-consistent loss…
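
The idea of scoring images as "relatively real or fake" can be illustrated with a minimal sketch. It assumes the generic relativistic GAN formulation, in which the discriminator loss depends on the difference between the critic scores of a real and a synthetic image. It is not the paper's exact relativistic conditional loss: the text conditioning and the $K$-image setup are omitted, and the function names are hypothetical.

```python
import math


def log_sigmoid(x):
    # Numerically stable log(sigmoid(x)) for a single float.
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def relativistic_d_loss(real_scores, fake_scores):
    # Discriminator side: reward real images that score higher than
    # their paired synthetic images (assumed pairing, illustrative only).
    pairs = list(zip(real_scores, fake_scores))
    return -sum(log_sigmoid(r - f) for r, f in pairs) / len(pairs)


def relativistic_g_loss(real_scores, fake_scores):
    # Generator side: the mirrored objective, rewarding synthetic images
    # that score higher than their paired real images.
    pairs = list(zip(real_scores, fake_scores))
    return -sum(log_sigmoid(f - r) for r, f in pairs) / len(pairs)
```

When a real and a synthetic image receive the same critic score, both losses equal log 2, since the critic cannot tell which of the two is more realistic. As the real score rises above the synthetic one, the discriminator loss falls and the generator loss grows.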
