Scene Designer: a Unified Model for Scene Search and Synthesis from Sketch

Leo Sampaio Ferraz Ribeiro; Tu Bui; John Collomosse; Moacir Ponti

arXiv:2108.07353·cs.CV·August 18, 2021

TL;DR

Scene Designer introduces a unified model that combines scene search and synthesis from sketches using a graph neural network and Transformer, enabling accurate scene retrieval and coherent layout generation.

Contribution

A novel unified model that jointly learns cross-modal scene search and layout synthesis from sketches, using a GNN followed by a Transformer trained with contrastive learning.

Findings

01 State-of-the-art sketch-based scene search accuracy

02 Effective scene layout synthesis from sketches

03 Unified framework for search and synthesis

Abstract

Scene Designer is a novel method for searching and generating images using free-hand sketches of scene compositions, i.e. drawings that describe both the appearance and relative positions of objects. Our core contribution is a single unified model that learns both a cross-modal search embedding for matching sketched compositions to images and an object embedding for layout synthesis. We show that a graph neural network (GNN) followed by a Transformer, trained under our novel contrastive learning setting, is required to learn correlations between object type, appearance and arrangement, driving a mask generation module that synthesises coherent scene layouts whilst also delivering state-of-the-art sketch-based visual search of scenes.
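To make the idea of a cross-modal search embedding concrete, the sketch below shows a generic InfoNCE-style contrastive loss and a cosine-similarity retrieval step in plain Python. This is an illustrative assumption, not the paper's actual loss or architecture: the abstract does not specify the contrastive formulation, and the names `info_nce`, `retrieve`, and the `temperature` parameter are hypothetical. The embeddings are assumed to come from some sketch encoder and image encoder (in the paper, a GNN followed by a Transformer), which are not modelled here.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def info_nce(sketch_embs, image_embs, temperature=0.1):
    """Generic InfoNCE-style loss (an assumption, not the paper's exact loss).

    Sketch i is treated as the positive match for image i; every other
    image in the batch acts as a negative.
    """
    sketches = [l2_normalize(v) for v in sketch_embs]
    images = [l2_normalize(v) for v in image_embs]
    total = 0.0
    for i, query in enumerate(sketches):
        logits = [dot(query, key) / temperature for key in images]
        # Numerically stable log-sum-exp over all candidates in the batch.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(sketches)


def retrieve(query_emb, image_embs):
    """Rank image indices by cosine similarity to the query, best first."""
    q = l2_normalize(query_emb)
    scores = [dot(q, l2_normalize(k)) for k in image_embs]
    return sorted(range(len(scores)), key=lambda j: -scores[j])


# Toy 2-D embeddings standing in for encoder outputs.
sketches = [[1.0, 0.0], [0.0, 1.0]]
images = [[0.9, 0.1], [0.1, 0.9]]
aligned_loss = info_nce(sketches, images)
mismatched_loss = info_nce(sketches, images[::-1])
ranking = retrieve([1.0, 0.05], images)
```

In this toy setup the loss is lower when each sketch is paired with its matching image than when the pairs are swapped, which is the signal a contrastive objective uses to pull matching sketch and image embeddings together. At search time, only the ranking step is needed.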

Code & Models

Repositories

leosampaio/scene-designer · tf · Official

Taxonomy

Methods: Multi-Head Attention · Attention Is All You Need · Graph Neural Network · Linear Layer · Contrastive Learning · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Softmax · Layer Normalization