MaskSketch: Unpaired Structure-guided Masked Image Generation

Dina Bashkirova; Jose Lezama; Kihyuk Sohn; Kate Saenko; Irfan Essa

arXiv:2302.05496·cs.CV·February 14, 2023·1 cites

TL;DR

MaskSketch is a novel image generation method that enables spatial control over generated images using sketches as guidance, leveraging a pre-trained transformer without additional training.

Contribution

It introduces a structure-guided sampling technique utilizing self-attention maps in a pre-trained transformer, allowing unpaired, sketch-based image generation with high fidelity.

Findings

01. Outperforms state-of-the-art sketch-to-image translation methods

02. Achieves high realism and structural fidelity in generated images

03. Operates without additional training or paired supervision

Abstract

Recent conditional image generation methods produce images of remarkable diversity, fidelity and realism. However, the majority of these methods allow conditioning only on labels or text prompts, which limits their level of control over the generation result. In this paper, we introduce MaskSketch, an image generation method that allows spatial conditioning of the generation result using a guiding sketch as an extra conditioning signal during sampling. MaskSketch utilizes a pre-trained masked generative transformer, requiring no model training or paired supervision, and works with input sketches of different levels of abstraction. We show that intermediate self-attention maps of a masked generative transformer encode important structural information of the input image, such as scene layout and object shape, and we propose a novel sampling method based on this observation to enable…
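
The idea of using self-attention maps as a structural signal can be sketched roughly as follows. This is a toy illustration, not the paper's implementation: the attention maps here come from random token features rather than a pre-trained masked generative transformer, the mean-absolute-difference distance is an assumed stand-in for whatever metric the authors use, and all function names (`toy_self_attention`, `structure_distance`, `pick_closest_candidate`) are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def toy_self_attention(features):
    """Build a row-stochastic (N, N) attention map from (N, D) token features.

    Assumption: a real system would read these maps from intermediate
    layers of a pre-trained transformer instead of computing them here.
    """
    d = features.shape[-1]
    scores = features @ features.T / np.sqrt(d)
    return softmax(scores, axis=-1)


def structure_distance(attn_a, attn_b):
    """Mean absolute difference between two attention maps (assumed metric)."""
    return float(np.abs(attn_a - attn_b).mean())


def pick_closest_candidate(sketch_features, candidate_features):
    """Return the index of the candidate whose attention map best matches the sketch."""
    sketch_attn = toy_self_attention(sketch_features)
    distances = [
        structure_distance(sketch_attn, toy_self_attention(c))
        for c in candidate_features
    ]
    return int(np.argmin(distances)), distances


rng = np.random.default_rng(0)
sketch = rng.normal(size=(16, 8))  # 16 tokens, 8-dim features
candidates = [
    rng.normal(size=(16, 8)),                   # unrelated layout
    sketch + 0.01 * rng.normal(size=(16, 8)),   # nearly the same layout
    rng.normal(size=(16, 8)),                   # unrelated layout
]
best, dists = pick_closest_candidate(sketch, candidates)
print("closest candidate:", best)
```

In this toy setup the second candidate is a lightly perturbed copy of the sketch features, so its attention map stays closest to the sketch's. A guided sampler along these lines would use such a distance to prefer candidate tokens whose attention structure agrees with the sketch.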

Code & Models

Models

Lakshmanaraja/ControlNetGit (Hugging Face model)

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis