StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks

Han Zhang; Tao Xu; Hongsheng Li; Shaoting Zhang; Xiaogang Wang; Xiaolei Huang; Dimitris Metaxas

arXiv:1710.10916·cs.CV·June 29, 2018

TL;DR

StackGAN++ proposes stacked, multi-stage GAN architectures for generating high-resolution, photo-realistic images: StackGAN-v1 for text-to-image synthesis, and StackGAN-v2 for both conditional and unconditional generation.

Contribution

The paper presents StackGAN-v1 and StackGAN-v2, novel multi-stage GAN architectures that enhance image resolution and realism, with StackGAN-v2 offering more stable training and multi-scale image generation.

Findings

01

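StackGAN-v1 generates images from text descriptions in two stages: Stage-I produces a low-resolution sketch of shape and color, and Stage-II refines it into a high-resolution image.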

02

StackGAN-v2 produces high-resolution, photo-realistic images with stable training.

03

StackGAN architectures outperform previous methods in image quality.

Abstract

Although Generative Adversarial Networks (GANs) have shown remarkable success in various tasks, they still face challenges in generating high-quality images. In this paper, we propose Stacked Generative Adversarial Networks (StackGAN) aiming at generating high-resolution photo-realistic images. First, we propose a two-stage generative adversarial network architecture, StackGAN-v1, for text-to-image synthesis. The Stage-I GAN sketches the primitive shape and colors of the object based on a given text description, yielding low-resolution images. The Stage-II GAN takes Stage-I results and text descriptions as inputs, and generates high-resolution images with photo-realistic details. Second, an advanced multi-stage generative adversarial network architecture, StackGAN-v2, is proposed for both conditional and unconditional generative tasks. Our StackGAN-v2 consists of multiple generators and…
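
The two-stage data flow described for StackGAN-v1 can be illustrated with a toy sketch. It is not the authors' implementation: the function names, the 64 and 256 pixel sizes, and the arithmetic inside each stage are placeholders assumed for illustration only. What it does show is the wiring from the abstract: Stage-I maps a text embedding plus noise to a low-resolution image, and Stage-II takes that image together with the same text embedding and returns a higher-resolution image.

```python
import random

LOW_RES = 64    # assumed Stage-I output size (not stated in the abstract)
HIGH_RES = 256  # assumed Stage-II output size (not stated in the abstract)


def stage1_generator(text_embedding, noise, size=LOW_RES):
    """Hypothetical stand-in for Stage-I: a coarse size x size grayscale image.

    A real Stage-I generator is a trained network; here each pixel is only a
    simple mix of the conditioning vector and the noise vector.
    """
    base = sum(text_embedding) / len(text_embedding)
    jitter = sum(noise) / len(noise)
    return [[base + jitter * ((r + c) % 2) for c in range(size)]
            for r in range(size)]


def stage2_generator(low_res_image, text_embedding, size=HIGH_RES):
    """Hypothetical stand-in for Stage-II: uses the Stage-I image AND the text.

    Nearest-neighbour upsampling replaces the learned refinement step.
    """
    factor = size // len(low_res_image)
    bias = 0.01 * sum(text_embedding)
    return [[low_res_image[r // factor][c // factor] + bias
             for c in range(size)]
            for r in range(size)]


def generate(text_embedding, seed=0):
    """Run both stages in sequence and return (low_res, high_res)."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in range(8)]
    low = stage1_generator(text_embedding, noise)
    high = stage2_generator(low, text_embedding)
    return low, high


low, high = generate([0.1, 0.2, 0.3])
print(len(low), len(low[0]), len(high), len(high[0]))
```

The point of the sketch is that the text condition enters twice, once per stage, so the second stage can correct or add detail that the first stage missed rather than only enlarging its output.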

Code & Models

Videos

OpenAI's DALL-E explained. How GPT-3 creates images from descriptions. · YouTube

Taxonomy

Methods: Convolution