Text to Image Synthesis Using Generative Adversarial Networks
Cristian Bodnar

TL;DR
This paper introduces Wasserstein GAN-CLS, a new conditional GAN model for text-to-image synthesis that improves stability and image quality, achieving higher Inception Scores on benchmark datasets.
Contribution
The paper proposes Wasserstein GAN-CLS, a novel stable conditional GAN model utilizing Wasserstein distance, and demonstrates its effectiveness in boosting image quality in text-to-image synthesis.
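As a rough illustration of how a Wasserstein-style critic can be combined with the matching-aware idea behind GAN-CLS, the sketch below scores three kinds of image-text pairs. The exact form of the loss, the weighting `alpha`, and the function names are assumptions made for illustration and are not taken from the paper. The Lipschitz constraint that a Wasserstein critic requires (for example, a gradient penalty) is deliberately left out.

```python
from statistics import fmean


def critic_loss(real_match, fake_match, real_mismatch, alpha=1.0):
    """Hypothetical Wasserstein-style critic loss with a matching-aware term.

    Each argument is a list of raw critic scores (unbounded reals, not
    probabilities):
      real_match    -- real image paired with its own caption
      fake_match    -- generated image paired with its conditioning caption
      real_mismatch -- real image paired with a caption of another image
    The critic minimises this value: it pushes scores of matching real pairs
    up and scores of generated or mismatched pairs down.
    """
    return (
        fmean(fake_match)
        + alpha * fmean(real_mismatch)
        - (1.0 + alpha) * fmean(real_match)
    )


def generator_loss(fake_match):
    """The generator tries to raise the critic's score on its own samples."""
    return -fmean(fake_match)


if __name__ == "__main__":
    # Toy scores only; a real model would produce these from image-text batches.
    print(critic_loss([2.0, 4.0], [1.0, 1.0], [0.0, 2.0], alpha=0.5))
    print(generator_loss([1.0, 1.0]))
```

Because the critic outputs unbounded scores instead of probabilities, the losses are plain differences of means with no logarithms, which is the property usually associated with the more stable training of Wasserstein GANs.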
Findings
Wasserstein GAN-CLS improves stability in conditional image generation.
The model boosts the Inception Score by 7.07% on the Caltech birds dataset.
It outperforms previous models that use sentence-level semantics and comes close to AttnGAN.
Abstract
Generating images from natural language is one of the primary applications of recent conditional generative models. Besides testing our ability to model conditional, high-dimensional distributions, text-to-image synthesis has many exciting and practical applications such as photo editing or computer-aided content creation. Recent progress has been made using Generative Adversarial Networks (GANs). This material starts with a gentle introduction to these topics and discusses the existing state-of-the-art models. Moreover, I propose Wasserstein GAN-CLS, a new model for conditional image generation based on the Wasserstein distance which offers guarantees of stability. Then, I show how the novel loss function of Wasserstein GAN-CLS can be used in a Conditional Progressive Growing GAN. In combination with the proposed loss, the model improves by 7.07% the best Inception Score (on the…
Taxonomy
Methods: Convolution
