Controlling Style and Semantics in Weakly-Supervised Image Generation

Dario Pavllo; Aurelien Lucchi; Thomas Hofmann

arXiv:1912.03161 · cs.CV · November 23, 2020

TL;DR

This paper introduces a weakly-supervised method for conditional image generation of complex scenes. Sparse semantic maps control object shapes and classes, while textual descriptions or attributes control both local and global style.

Contribution

The paper presents a weakly-supervised framework that combines a semantic attention module, whose computational cost is independent of the image resolution, with a two-step generation scheme that decomposes background and foreground.

Findings

01 Reports better FID scores when trained on detector-produced label maps than in fully-supervised settings.

02 Demonstrates scene manipulation on complex datasets such as COCO and Visual Genome.

03 Uses unlabeled data through label maps produced by a large-vocabulary object detector.

Abstract

We propose a weakly-supervised approach for conditional image generation of complex scenes where a user has fine control over objects appearing in the scene. We exploit sparse semantic maps to control object shapes and classes, as well as textual descriptions or attributes to control both local and global style. In order to condition our model on textual descriptions, we introduce a semantic attention module whose computational cost is independent of the image resolution. To further augment the controllability of the scene, we propose a two-step generation scheme that decomposes background and foreground. The label maps used to train our model are produced by a large-vocabulary object detector, which enables access to unlabeled data and provides structured instance information. In such a setting, we report better FID scores compared to fully-supervised settings where the model is…
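The abstract states that the semantic attention module has a cost independent of the image resolution. One way such a property can arise is sketched below; this is an illustrative assumption, not the authors' implementation. The idea is to compute attention once per object class over the caption tokens, and only afterwards scatter the per-class result onto pixels with a label-map lookup. The names `class_level_attention` and `broadcast_to_pixels`, the tensor sizes, and the use of scaled dot-product attention are all hypothetical choices made for this example.

```python
import numpy as np


def class_level_attention(class_queries, token_keys, token_values):
    """Attend from each object class to the caption tokens.

    class_queries: (C, d) one query per class (hypothetical embedding)
    token_keys:    (T, d) one key per caption token
    token_values:  (T, v) one value per caption token
    Returns a (C, v) array: one style vector per class.
    """
    d = class_queries.shape[1]
    scores = class_queries @ token_keys.T / np.sqrt(d)   # (C, T)
    scores -= scores.max(axis=1, keepdims=True)          # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)        # softmax over tokens
    return weights @ token_values                        # (C, v)


def broadcast_to_pixels(class_styles, label_map):
    """Look up each pixel's class style; label_map is (H, W) of class ids."""
    return class_styles[label_map]                       # (H, W, v)


rng = np.random.default_rng(0)
C, T, d, v = 4, 6, 8, 5          # classes, tokens, key dim, value dim (assumed)
queries = rng.normal(size=(C, d))
keys = rng.normal(size=(T, d))
values = rng.normal(size=(T, v))

# The attention step never sees H or W, so its cost depends only on C and T.
styles = class_level_attention(queries, keys, values)

# The same per-class result serves label maps of any resolution.
small_map = rng.integers(0, C, size=(16, 16))
large_map = rng.integers(0, C, size=(256, 256))
small = broadcast_to_pixels(styles, small_map)
large = broadcast_to_pixels(styles, large_map)
```

In this sketch only the final lookup scales with the number of pixels, while the attention weights are shared by every pixel of the same class. Whether the paper's module follows this exact decomposition should be checked against the official repository.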

Code & Models

Repositories

dariopavllo/style-semantics (PyTorch, official)
