MirrorGAN: Learning Text-to-image Generation by Redescription

Tingting Qiao; Jing Zhang; Duanqing Xu; Dacheng Tao

arXiv:1903.05854 · cs.CL · March 15, 2019 · 72 cites

TL;DR

MirrorGAN is a text-to-image generation framework that targets semantic consistency through redescription: it generates an image from a description with cascaded attention modules, regenerates text from that image, and aligns the regenerated text with the original description.

Contribution

The paper proposes MirrorGAN, a new framework that enforces semantic alignment in text-to-image synthesis via a redescription mechanism and cascaded attention modules.

Findings

01 Outperforms state-of-the-art methods on benchmark datasets

02 Achieves higher semantic consistency in generated images

03 Demonstrates effective text-image-text alignment

Abstract

Generating an image from a given text description has two goals: visual realism and semantic consistency. Although significant progress has been made in generating high-quality and visually realistic images using generative adversarial networks, guaranteeing semantic consistency between the text description and visual content remains very challenging. In this paper, we address this problem by proposing a novel global-local attentive and semantic-preserving text-to-image-to-text framework called MirrorGAN. MirrorGAN exploits the idea of learning text-to-image generation by redescription and consists of three modules: a semantic text embedding module (STEM), a global-local collaborative attentive module for cascaded image generation (GLAM), and a semantic text regeneration and alignment module (STREAM). STEM generates word- and sentence-level embeddings. GLAM has a cascaded architecture…
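The redescription idea can be illustrated with a small sketch. The truncated abstract above does not say how the regenerated text is compared with the input description, so the per-token cross-entropy used here is an assumption, and the names `softmax` and `redescription_loss` are hypothetical helpers, not the authors' code. The sketch assumes a captioning step has already produced one vector of vocabulary logits per word position for the generated image.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def redescription_loss(step_logits, target_ids):
    """Mean cross-entropy between regenerated-caption distributions and the
    token ids of the original description.

    Assumption: this loss form is chosen for illustration only; the source
    text does not specify the alignment objective.
    """
    if not target_ids or len(step_logits) != len(target_ids):
        raise ValueError("need one logit vector per target token")
    nll = 0.0
    for logits, token_id in zip(step_logits, target_ids):
        probs = softmax(logits)
        nll -= math.log(probs[token_id])
    return nll / len(target_ids)


# Toy vocabulary of 3 tokens; the original description is tokens [0, 1].
description = [0, 1]

# Logits a captioner might emit for an image that matches the description...
aligned = redescription_loss([[5.0, 0.0, 0.0], [0.0, 5.0, 0.0]], description)
# ...and for an image whose regenerated caption disagrees with it.
misaligned = redescription_loss([[0.0, 5.0, 0.0], [5.0, 0.0, 0.0]], description)

print(aligned < misaligned)
```

Under this assumption, a generated image whose regenerated caption matches the input description receives a lower loss than one whose caption drifts, which is the signal a text-to-image-to-text loop would feed back to the generator.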

Taxonomy

Topics: Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis · Video Analysis and Summarization