Dual Contrastive Loss and Attention for GANs
Ning Yu, Guilin Liu, Aysegul Dundar, Andrew Tao, Bryan Catanzaro, Larry Davis, Mario Fritz

TL;DR
This paper introduces a dual contrastive loss and enhanced attention mechanisms in GANs, significantly improving image quality and diversity on challenging datasets by refining discriminator representations and generator attention modules.
Contribution
It proposes a novel dual contrastive loss for the discriminator and explores attention architectures in both the generator and the discriminator, improving on state-of-the-art FID scores for image generation.
Findings
Achieved at least 17.5% FID improvement on benchmark datasets.
Significantly enhanced image quality on high-variance datasets.
Demonstrated importance of attention modules in GANs.
Abstract
Generative Adversarial Networks (GANs) produce impressive results on unconditional image generation when powered with large-scale image datasets. Yet generated images are still easy to spot especially on datasets with high variance (e.g. bedroom, church). In this paper, we propose various improvements to further push the boundaries in image generation. Specifically, we propose a novel dual contrastive loss and show that, with this loss, discriminator learns more generalized and distinguishable representations to incentivize generation. In addition, we revisit attention and extensively experiment with different attention blocks in the generator. We find attention to be still an important module for successful image generation even though it was not used in the recent state-of-the-art models. Lastly, we study different attention architectures in the discriminator, and propose a reference…
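
The abstract names a dual contrastive loss but does not give its formula here. The following is a minimal NumPy sketch of one plausible reading, assuming the loss contrasts a single real image against a batch of generated images and, in the dual case, a single generated image against a batch of real images, using scalar discriminator logits. The function names (`dual_contrastive_d_loss`, `_log_softmax_first`) and the sign-flip in the dual case are assumptions made for illustration, not the authors' confirmed implementation.

```python
import numpy as np


def _log_softmax_first(logits):
    """Log-probability of column 0 under a softmax over the last axis."""
    # Subtract the row maximum for numerical stability.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted[..., 0] - np.log(np.exp(shifted).sum(axis=-1))


def dual_contrastive_d_loss(real_logits, fake_logits):
    """Hypothetical discriminator loss built from two contrastive cases.

    real_logits: 1-D array of discriminator outputs on real images.
    fake_logits: 1-D array of discriminator outputs on generated images.
    """
    real_logits = np.asarray(real_logits, dtype=np.float64)
    fake_logits = np.asarray(fake_logits, dtype=np.float64)
    n_real, n_fake = real_logits.shape[0], fake_logits.shape[0]

    # Case 1 (assumed): each real logit is the positive in column 0,
    # and every fake logit in the batch is a negative.
    case1 = np.concatenate(
        [real_logits[:, None], np.tile(fake_logits, (n_real, 1))], axis=1
    )

    # Case 2 (assumed dual): each fake logit is the positive and the real
    # logits are negatives, with signs flipped so that a low score on a
    # fake image counts as "correct".
    case2 = np.concatenate(
        [-fake_logits[:, None], np.tile(-real_logits, (n_fake, 1))], axis=1
    )

    # Cross-entropy with the positive at index 0, averaged per case.
    return -(_log_softmax_first(case1).mean() + _log_softmax_first(case2).mean())


if __name__ == "__main__":
    uninformative = dual_contrastive_d_loss(np.zeros(4), np.zeros(4))
    separated = dual_contrastive_d_loss(np.full(4, 5.0), np.full(4, -5.0))
    print(uninformative, separated)
```

Under these assumptions, a discriminator that scores real and generated images identically pays the maximum-entropy penalty in both cases, while one that separates them cleanly drives the loss toward zero. In this formulation the generator would be trained against the same objective with the roles of the two cases reversed.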
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Advanced Image Processing Techniques · Multimodal Machine Learning Applications
