Text-Adaptive Generative Adversarial Networks: Manipulating Images with Natural Language
Seonghyeon Nam, Yunji Kim, Seon Joo Kim

TL;DR
This paper introduces TAGAN, a novel GAN framework that manipulates images based on natural language descriptions while preserving irrelevant content, outperforming existing methods on benchmark datasets.
Contribution
We propose a text-adaptive discriminator that enables fine-grained, attribute-specific image manipulation guided by natural language, improving content preservation and attribute control.
Findings
Outperforms existing methods on CUB and Oxford-102 datasets
Achieves higher user preference scores in evaluations
Effectively disentangles visual attributes for realistic modifications
Abstract
This paper addresses the problem of manipulating images using natural language descriptions. Our task aims to semantically modify the visual attributes of an object in an image according to text describing the new visual appearance. Although existing methods synthesize images with new attributes, they do not fully preserve the text-irrelevant contents of the original image. We propose the text-adaptive generative adversarial network (TAGAN) to generate semantically manipulated images while preserving text-irrelevant contents. The key to our method is the text-adaptive discriminator, which creates word-level local discriminators according to the input text to classify fine-grained attributes independently. With this discriminator, the generator learns to generate images in which only the regions that correspond to the given text are modified. Experimental results show that our method…
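To make the idea of word-level local discriminators more concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes that each word embedding is mapped to the weights and bias of a small linear classifier over a pooled image feature, and that the per-word scores are combined by an attention-weighted geometric mean. The function name `text_adaptive_score`, the parameters `W`, `b_vec`, and `u`, the dimensions, and the aggregation rule are all illustrative assumptions; the abstract states only that the discriminators are created per word and classify attributes independently.

```python
import math
import random


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def text_adaptive_score(image_feat, word_vecs, W, b_vec, u):
    """Hypothetical text-adaptive discriminator score in (0, 1).

    image_feat: D floats, a pooled image feature (assumed given).
    word_vecs:  T word embeddings, each with E floats.
    W:          D x E matrix; W applied to a word gives that word's classifier weights.
    b_vec:      E floats; dot(b_vec, word) gives that word's classifier bias.
    u:          E floats; dot(u, word) gives that word's attention logit.
    """
    local_scores = []
    for w in word_vecs:
        weights = [dot(row, w) for row in W]  # word-conditioned weights
        bias = dot(b_vec, w)
        # One local discriminator per word, evaluated independently.
        local_scores.append(sigmoid(dot(weights, image_feat) + bias))

    # Assumption: attention over words, then a weighted geometric mean,
    # so one clearly violated word can pull the overall score down.
    attn = softmax([dot(u, w) for w in word_vecs])
    log_score = sum(a * math.log(max(s, 1e-12))
                    for a, s in zip(attn, local_scores))
    return math.exp(log_score), local_scores, attn


# Toy usage with random numbers standing in for learned parameters.
rng = random.Random(0)
D, E, T = 4, 3, 3  # feature size, embedding size, number of words
image_feat = [rng.uniform(-1, 1) for _ in range(D)]
word_vecs = [[rng.uniform(-1, 1) for _ in range(E)] for _ in range(T)]
W = [[rng.uniform(-1, 1) for _ in range(E)] for _ in range(D)]
b_vec = [rng.uniform(-1, 1) for _ in range(E)]
u = [rng.uniform(-1, 1) for _ in range(E)]

score, local_scores, attn = text_adaptive_score(image_feat, word_vecs, W, b_vec, u)
print("per-word scores:", local_scores)
print("attention:", attn)
print("combined score:", score)
```

In this sketch each word receives its own score, so a generator trained against such a discriminator would get feedback on each described attribute separately rather than on the sentence as a whole.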
Taxonomy
Topics: Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis · Handwritten Text Recognition Techniques
