Semantic Image Synthesis via Adversarial Learning

Hao Dong; Simiao Yu; Chao Wu; Yike Guo

arXiv:1707.06873 · cs.CV · July 24, 2017 · 38 cites

TL;DR

This paper introduces an adversarial learning framework that, given a source image and a target text description, synthesizes a realistic image that matches the description while preserving the source image's features that are unrelated to it.

Contribution

It presents an end-to-end neural architecture that uses adversarial learning to disentangle the semantic information in the image and the text, and to generate new images from the combined semantics.

Findings

01 Capable of generating realistic images matching text descriptions

02 Maintains original image features unrelated to the description

03 Effective on Caltech-200 bird and Oxford-102 flower datasets

Abstract

In this paper, we propose a way of synthesizing realistic images directly with natural language description, which has many useful applications, e.g. intelligent image manipulation. We attempt to accomplish such synthesis: given a source image and a target text description, our model synthesizes images to meet two requirements: 1) being realistic while matching the target text description; 2) maintaining other image features that are irrelevant to the text description. The model should be able to disentangle the semantic information from the two modalities (image and text), and generate new images from the combined semantics. To achieve this, we propose an end-to-end neural architecture that leverages adversarial learning to automatically learn implicit loss functions, which are optimized to fulfill the aforementioned two requirements. We have evaluated our model by conducting…

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Image Processing and 3D Reconstruction · Multimodal Machine Learning Applications