TransGAN: Two Pure Transformers Can Make One Strong GAN, and That Can Scale Up

Yifan Jiang; Shiyu Chang; Zhangyang Wang

arXiv:2102.07074·cs.CV·December 10, 2021·263 cites

TL;DR

TransGAN demonstrates that pure transformer architectures can effectively replace convolutional networks in GANs, achieving competitive high-resolution image generation without convolutions, and introduces techniques to stabilize training.

Contribution

This work pioneers the use of fully transformer-based GANs, proposing a novel architecture and training methods to enable high-quality image synthesis without convolutions.

Findings

01. TransGAN sets a new state of the art on STL-10, with an Inception Score of 10.43 and an FID of 18.28.

02. It produces diverse, high-fidelity images at 256 × 256 resolution.

03. On STL-10, the model outperforms the convolutional StyleGAN-V2.

Abstract

The recent explosive interest in transformers has suggested their potential to become powerful "universal" models for computer vision tasks, such as classification, detection, and segmentation. While those attempts mainly study discriminative models, we explore transformers on some more notoriously difficult vision tasks, e.g., generative adversarial networks (GANs). Our goal is to conduct the first pilot study in building a GAN completely free of convolutions, using only pure transformer-based architectures. Our vanilla GAN architecture, dubbed TransGAN, consists of a memory-friendly transformer-based generator that progressively increases feature resolution, and correspondingly a multi-scale discriminator to simultaneously capture semantic contexts and low-level textures. On top of them, we introduce the new module of grid self-attention for alleviating the memory bottleneck…
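The grid self-attention idea mentioned in the abstract can be illustrated with a minimal NumPy sketch. It assumes a single attention head, a square grid size `g` that divides the feature map, and plain projection matrices `wq`, `wk`, `wv`; these names and the helper functions are hypothetical and are not taken from the authors' code. The feature map is split into non-overlapping `g × g` grids and attention is computed only among tokens in the same grid, so the score tensor holds `H·W·g²` entries rather than the `(H·W)²` needed by full self-attention.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def grid_partition(x, g):
    # (H, W, C) -> (num_grids, g*g, C): one token sequence per grid.
    H, W, C = x.shape
    assert H % g == 0 and W % g == 0, "grid size must divide H and W"
    x = x.reshape(H // g, g, W // g, g, C)
    x = x.transpose(0, 2, 1, 3, 4)  # (H/g, W/g, g, g, C)
    return x.reshape(-1, g * g, C)

def grid_merge(tokens, H, W, g):
    # Inverse of grid_partition: (num_grids, g*g, C) -> (H, W, C).
    C = tokens.shape[-1]
    x = tokens.reshape(H // g, W // g, g, g, C)
    x = x.transpose(0, 2, 1, 3, 4)  # (H/g, g, W/g, g, C)
    return x.reshape(H, W, C)

def grid_self_attention(x, wq, wk, wv, g):
    # Single-head scaled dot-product attention restricted to each grid.
    H, W, _ = x.shape
    tokens = grid_partition(x, g)                 # (N, g*g, C)
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])  # (N, g*g, g*g)
    out = softmax(scores, axis=-1) @ v            # (N, g*g, C_out)
    return grid_merge(out, H, W, g)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(8, 8, 4))                # toy 8x8 feature map, 4 channels
    wq, wk, wv = (rng.normal(size=(4, 4)) for _ in range(3))
    y = grid_self_attention(x, wq, wk, wv, g=4)
    print(y.shape)
```

Setting `g` equal to the full feature-map side recovers ordinary global self-attention, which is one way to see the trade-off: smaller grids save memory at the cost of restricting each token's receptive field to its own grid.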

Code & Models

Videos

TransGAN: Two Transformers Can Make One Strong GAN (Machine Learning Research Paper Explained) · YouTube

TransGAN: Two Pure Transformers Can Make One Strong GAN, and That Can Scale Up · SlidesLive

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Digital Media Forensic Detection · Advanced Image Processing Techniques