Improving Text-to-Image Synthesis Using Contrastive Learning
Hui Ye, Xiulong Yang, Martin Takac, Rajshekhar Sunderraman, Shihao Ji

TL;DR
This paper introduces a contrastive learning framework to improve the semantic consistency and quality of images generated from text descriptions, addressing caption variance issues in text-to-image synthesis.
Contribution
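The paper proposes a contrastive learning method applied in two stages: during pretraining, to learn consistent text representations for captions of the same image, and during GAN training, to make the images generated from those captions more consistent with one another.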
Findings
Significant improvement in FID scores on the COCO dataset
Enhanced semantic consistency in generated images
Better performance on quality metrics like IS and R-precision
Abstract
The goal of text-to-image synthesis is to generate a visually realistic image that matches a given text description. In practice, the captions annotated by humans for the same image have large variance in terms of contents and the choice of words. The linguistic discrepancy between the captions of the identical image leads to the synthetic images deviating from the ground truth. To address this issue, we propose a contrastive learning approach to improve the quality and enhance the semantic consistency of synthetic images. In the pretraining stage, we utilize the contrastive learning approach to learn the consistent textual representations for the captions corresponding to the same image. Furthermore, in the following stage of GAN training, we employ the contrastive learning method to enhance the consistency between the generated images from the captions related to the same image. We…
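The idea of pulling together the representations of captions that describe the same image can be sketched with a generic InfoNCE-style loss. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the temperature value, and the toy embeddings are all hypothetical, and the exact loss used in the paper may differ.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def contrastive_loss(emb_a, emb_b, temperature=0.5):
    """Generic InfoNCE-style loss (an assumption, not the paper's exact loss).

    emb_a[i] and emb_b[i] are embeddings of two captions of the same image i
    (the positive pair). Every emb_b[j] with j != i acts as a negative.
    """
    a = [l2_normalize(v) for v in emb_a]
    b = [l2_normalize(v) for v in emb_b]
    n = len(a)
    total = 0.0
    for i in range(n):
        # Temperature-scaled cosine similarity of caption a_i to every caption in b.
        logits = [
            sum(x * y for x, y in zip(a[i], b[j])) / temperature
            for j in range(n)
        ]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denom = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Cross-entropy with the positive pair at index i.
        total += log_denom - logits[i]
    return total / n


# Toy embeddings (hypothetical): two images, two captions each.
captions_a = [[1.0, 0.0], [0.0, 1.0]]
captions_b_aligned = [[1.0, 0.0], [0.0, 1.0]]   # same-image captions agree
captions_b_shuffled = [[0.0, 1.0], [1.0, 0.0]]  # same-image captions disagree

loss_aligned = contrastive_loss(captions_a, captions_b_aligned)
loss_shuffled = contrastive_loss(captions_a, captions_b_shuffled)
```

In this sketch the loss is lower when captions of the same image map to similar embeddings than when they do not, which is the property the pretraining stage aims for. The same form of loss could in principle be applied to features of images generated from paired captions, as the abstract describes for the GAN training stage.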
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Advanced Image and Video Retrieval Techniques · Computer Graphics and Visualization Techniques
Methods: Contrastive Learning
