Fine-grained Text to Image Synthesis

Xu Ouyang; Ying Chen; Kaiyue Zhu; Gady Agam

arXiv:2412.07196·cs.CV·December 17, 2024

Open Access

TL;DR

This paper enhances fine-grained text-to-image synthesis by integrating an auxiliary classifier and contrastive learning into GANs, leading to more accurate and detailed image generation from complex textual descriptions.

Contribution

It introduces a novel approach combining auxiliary classifiers and contrastive learning to improve fine-grained detail accuracy in GAN-based text-to-image synthesis.

Findings

01. Outperforms existing methods on CUB-200-2011 and Oxford-102 datasets.
02. Achieves higher accuracy in classifying fine-grained details.
03. Produces more realistic and detailed images from complex texts.

Abstract

Fine-grained text-to-image synthesis involves generating images from texts that belong to different categories. In contrast to general text-to-image synthesis, fine-grained synthesis must handle high similarity between images of different subclasses, and there may be linguistic discrepancies among texts describing the same image. Recent Generative Adversarial Networks (GANs), such as the Recurrent Affine Transformation (RAT) GAN model, are able to synthesize clear and realistic images from texts. However, these GAN models ignore fine-grained information. In this paper we propose an approach that incorporates an auxiliary classifier in the discriminator and a contrastive learning method to improve the accuracy of fine-grained details in images synthesized by RAT GAN. The auxiliary classifier helps the discriminator classify the class of images, and helps the generator synthesize more…
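The two ingredients named in the abstract can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation: the abstract does not say how contrastive pairs are formed or how the loss terms are weighted, so this assumes an InfoNCE-style image-to-text loss over cosine similarities and a simple weighted sum for the discriminator. The names `info_nce`, `discriminator_loss`, `aux_weight`, and `temperature` are hypothetical.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy for one example, given raw class logits and the true class index."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(image_embs, text_embs, temperature=0.1):
    """Assumed contrastive term: the i-th image should match the i-th text.

    Every other text in the batch acts as a negative for image i.
    """
    losses = []
    for i, img in enumerate(image_embs):
        sims = [cosine(img, txt) / temperature for txt in text_embs]
        losses.append(softmax_cross_entropy(sims, i))
    return sum(losses) / len(losses)


def discriminator_loss(adv_loss, class_logits, class_label, aux_weight=1.0):
    """Assumed combination: adversarial loss plus a weighted auxiliary class loss."""
    return adv_loss + aux_weight * softmax_cross_entropy(class_logits, class_label)


# Toy batch: two image embeddings and two caption embeddings.
images = [[1.0, 0.0], [0.0, 1.0]]
texts_aligned = [[0.9, 0.1], [0.1, 0.9]]   # caption i describes image i
texts_swapped = [[0.1, 0.9], [0.9, 0.1]]   # captions attached to the wrong images

print("contrastive loss, aligned:", info_nce(images, texts_aligned))
print("contrastive loss, swapped:", info_nce(images, texts_swapped))
print("discriminator loss:", discriminator_loss(0.5, [2.0, 0.1, -1.0], 0))
```

In this toy batch the aligned captions give a much lower contrastive loss than the swapped ones, which is the signal a generator could use to keep fine-grained details consistent with the text. The auxiliary term likewise penalizes the discriminator when its class head mislabels an image.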

Taxonomy

Topics: Image Processing and 3D Reconstruction

Methods: Contrastive Learning · Auxiliary Classifier