A Framework For Image Synthesis Using Supervised Contrastive Learning

Yibin Liu, Jianyu Zhang, Li Zhang, Shijian Li, and Gang Pan

arXiv:2412.03957·cs.CV·December 6, 2024

Open Access

TL;DR

This paper introduces a text-to-image generation framework that uses label-guided supervised contrastive learning to exploit both inter-modal (image-text) and inner-modal (images sharing a label) semantic relationships, improving image quality as measured by Inception Score and FID.

Contribution

It proposes a dual-branch contrastive learning approach integrated into T2I GANs, enhancing semantic clustering and image realism beyond prior methods.

Findings

01 Significant improvements in Inception Score and FID across datasets.

02 Enhanced image quality on complex multi-object datasets.

03 Outperforms existing label-guided T2I GANs.

Abstract

Text-to-image (T2I) generation aims to produce realistic images that correspond to text descriptions. Generative Adversarial Networks (GANs) have proven successful in this task. Typical T2I GANs are two-phase methods that first pretrain an inter-modal representation from aligned image-text pairs and then train an image generator with a GAN on that basis. However, such a representation ignores inner-modal semantic correspondence, e.g., between images that share the same label. The semantic label describes, a priori, the inherent distribution pattern with its underlying cross-image relationships, which supplements the text description in capturing the full characteristics of an image. In this paper, we propose a framework that leverages both inter- and inner-modal correspondence through label-guided supervised contrastive learning. We extend T2I GANs to two parameter-sharing contrast branches in both…
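
The abstract as shown here is truncated and does not state the loss itself, so the following is only a minimal sketch of the generic supervised contrastive loss (in the form popularized by Khosla et al., 2020), not the authors' dual-branch implementation. It illustrates the core idea of label guidance: embeddings that share a label are pulled together and all others are pushed apart. The function names, the `temperature` default, and the toy 2-D embeddings are assumptions made for illustration.

```python
import math


def l2_normalize(v):
    """Scale a non-zero vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Generic supervised contrastive loss over one batch (illustrative sketch).

    features: list of embedding vectors (hypothetical image or text embeddings)
    labels:   list of class labels, one per embedding
    Anchors with no same-label partner in the batch are skipped.
    """
    z = [l2_normalize(f) for f in features]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        # Positives: every other sample that shares the anchor's label.
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue
        # Temperature-scaled similarity of the anchor to every other sample.
        logits = {
            a: sum(x * y for x, y in zip(z[i], z[a])) / temperature
            for a in range(n)
            if a != i
        }
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(logits.values())
        log_denom = m + math.log(sum(math.exp(v - m) for v in logits.values()))
        # Average negative log-probability of the positives for this anchor.
        total += -sum(logits[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Toy batch: two tight clusters in 2-D.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_aligned = supervised_contrastive_loss(embeddings, [0, 0, 1, 1])
loss_misaligned = supervised_contrastive_loss(embeddings, [0, 1, 0, 1])
```

When the labels agree with the geometric clusters, the loss is small; when same-label samples sit in different clusters, it is much larger. That gap is the signal a label-guided branch would use to encourage semantic clustering.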

Taxonomy

Topics: Image Retrieval and Classification Techniques · Advanced Image and Video Retrieval Techniques