DTGAN: Dual Attention Generative Adversarial Networks for Text-to-Image Generation

Zhenxing Zhang; Lambert Schomaker

arXiv:2011.02709·cs.CV·May 10, 2022


TL;DR

DTGAN is a single-generator, attention-based model for text-to-image synthesis that improves image quality and semantic consistency while reducing complexity and training time relative to multi-stage approaches.

Contribution

The paper proposes DTGAN, a novel single-generator/discriminator framework with attention modules and a new visual loss for improved text-to-image generation.

Findings

01. Outperforms state-of-the-art multi-stage models on benchmark datasets.

02. Attention modules effectively localize discriminative regions and capture global visual content.

03. Enhances image resolution with a new visual loss ensuring vivid shapes and colors.

Abstract

Most existing text-to-image generation methods adopt a multi-stage modular architecture which has three significant problems: 1) Training multiple networks increases the run time and affects the convergence and stability of the generative model; 2) These approaches ignore the quality of early-stage generator images; 3) Many discriminators need to be trained. To this end, we propose the Dual Attention Generative Adversarial Network (DTGAN) which can synthesize high-quality and semantically consistent images only employing a single generator/discriminator pair. The proposed model introduces channel-aware and pixel-aware attention modules that can guide the generator to focus on text-relevant channels and pixels based on the global sentence vector and to fine-tune original feature maps using attention weights. Also, Conditional Adaptive Instance-Layer Normalization (CAdaILN) is presented…
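
The abstract describes the two attention modules only at a high level, so the following is a minimal illustrative sketch of the general idea and not the paper's formulation. It assumes the sentence vector has already been projected to the number of feature channels (`sent_proj`, a hypothetical name), scores channels and pixels by a simple product with that vector, and applies the resulting softmax weights as a residual rescaling of the original feature maps. The function names, the scoring rule, and the residual form are all assumptions made for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a flat list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def channel_attention(features, sent_proj):
    """Reweight channels by their agreement with the sentence vector.

    features:  C x H x W nested lists of floats.
    sent_proj: length-C list (sentence vector projected to C dims; assumed).
    Returns (refined features, per-channel weights).
    """
    # Global average pooling: one scalar descriptor per channel.
    pooled = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in features
    ]
    # Score each channel against the matching sentence component (assumed rule).
    weights = softmax([p * s for p, s in zip(pooled, sent_proj)])
    # Residual rescaling: original value plus its attention-weighted copy.
    refined = [
        [[v + w * v for v in row] for row in ch]
        for ch, w in zip(features, weights)
    ]
    return refined, weights


def pixel_attention(features, sent_proj):
    """Reweight spatial positions by their agreement with the sentence vector.

    Returns (refined features, flat list of H*W pixel weights in row-major order).
    """
    C, H, W = len(features), len(features[0]), len(features[0][0])
    # Each pixel's score is the dot product of its channel vector with sent_proj.
    scores = [
        sum(features[c][i][j] * sent_proj[c] for c in range(C))
        for i in range(H)
        for j in range(W)
    ]
    weights = softmax(scores)
    refined = [
        [
            [features[c][i][j] * (1.0 + weights[i * W + j]) for j in range(W)]
            for i in range(H)
        ]
        for c in range(C)
    ]
    return refined, weights
```

Calling `channel_attention(features, sent_proj)` followed by `pixel_attention` on its output would apply both refinements in sequence; the ordering and combination used in DTGAN itself are not stated in the abstract and are left open here.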
