Adversarial Text-to-Image Synthesis: A Review

Stanislav Frolov; Tobias Hinz; Federico Raue; Jörn Hees; Andreas Dengel

arXiv:2101.09983·cs.CV·October 7, 2021

TL;DR

This review paper discusses the progress, challenges, and future directions of adversarial text-to-image synthesis, emphasizing the importance of evaluation metrics, dataset quality, and architectural improvements.

Contribution

It proposes a taxonomy of adversarial text-to-image synthesis models based on the level of supervision, critically examines current evaluation strategies, and identifies key research gaps in the field.

Findings

01. Significant progress in visual realism, diversity, and semantic alignment.

02. Identification of challenges in high-resolution multi-object image generation.

03. Highlighting the need for better evaluation metrics and datasets.

Abstract

With the advent of generative adversarial networks, synthesizing images from textual descriptions has recently become an active research area. It is a flexible and intuitive way for conditional image generation with significant progress in the last years regarding visual realism, diversity, and semantic alignment. However, the field still faces several challenges that require further research efforts such as enabling the generation of high-resolution images with multiple objects, and developing suitable and reliable evaluation metrics that correlate with human judgement. In this review, we contextualize the state of the art of adversarial text-to-image synthesis models, their development since their inception five years ago, and propose a taxonomy based on the level of supervision. We critically examine current strategies to evaluate text-to-image synthesis models, highlight…
