Zero-shot spatial layout conditioning for text-to-image diffusion models
Guillaume Couairon, Marlène Careil, Matthieu Cord, Stéphane Lathuilière, Jakob Verbeek

TL;DR
This paper introduces ZestGuide, a zero-shot segmentation guidance method for text-to-image diffusion models that enables precise spatial control using implicit segmentation maps without additional training.
Contribution
It presents a novel zero-shot approach that integrates segmentation guidance into pre-trained diffusion models, improving spatial accuracy without extra training.
Findings
Enhanced spatial alignment with input masks
Improved mIoU scores on the COCO dataset
Maintained high image quality with better segmentation accuracy
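The mIoU (mean intersection-over-union) figure cited in the findings can be illustrated with a minimal sketch. It assumes label maps flattened into lists of integer class ids; the function name `mean_iou` and the toy masks are hypothetical, and this is not the paper's evaluation code. In a setting like this one, the predicted map would typically come from segmenting the generated image and the target map from the user's input mask, which is an assumption here rather than something stated above.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean intersection-over-union between two flattened label maps.

    Classes absent from both maps are skipped so they do not distort the mean.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of pixels")

    ious = []
    for c in range(num_classes):
        # Pixels labelled c in both maps.
        intersection = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels labelled c in at least one map.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(intersection / union)

    if not ious:
        raise ValueError("no class in range(num_classes) appears in either map")
    return sum(ious) / len(ious)


# Toy 2x3 layout flattened row by row: 0 = background, 1 = object (illustrative only).
target_mask = [0, 0, 1, 0, 1, 1]
predicted_mask = [0, 1, 1, 0, 1, 1]
score = mean_iou(predicted_mask, target_mask, num_classes=2)
print(f"mIoU = {score:.3f}")
```

In the toy example the background class has IoU 2/3 and the object class 3/4, so the mean is 17/24. A higher mIoU means the generated content overlaps the requested layout more closely.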
Abstract
Large-scale text-to-image diffusion models have significantly improved the state of the art in generative image modelling and allow for an intuitive and powerful user interface to drive the image generation process. Expressing spatial constraints, e.g. to position specific objects in particular locations, is cumbersome using text; and current text-based image generation models are not able to accurately follow such instructions. In this paper we consider image generation from text associated with segments on the image canvas, which combines an intuitive natural language interface with precise spatial control over the generated content. We propose ZestGuide, a zero-shot segmentation guidance approach that can be plugged into pre-trained text-to-image diffusion models, and does not require any additional training. It leverages implicit segmentation maps that can be extracted from…
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Computer Graphics and Visualization Techniques
Methods: ALIGN · Diffusion
