More Control for Free! Image Synthesis with Semantic Diffusion Guidance
Xihui Liu, Dong Huk Park, Samaneh Azadi, Gong Zhang, Arman Chopikyan, Yuxiao Hu, Humphrey Shi, Anna Rohrbach, Trevor Darrell

TL;DR
This paper introduces a unified semantic diffusion guidance framework enabling fine-grained, controllable image synthesis using text or image guidance without retraining the diffusion model, demonstrated on multiple datasets.
Contribution
The paper presents a novel method for controlling diffusion-based image synthesis with flexible guidance types, without needing to retrain the model.
Findings
Effective fine-grained control with text and image guidance
Successful synthesis on FFHQ and LSUN datasets
Ability to combine textual and visual guidance
Abstract
Controllable image synthesis models allow creation of diverse images based on text instructions or guidance from a reference image. Recently, denoising diffusion probabilistic models have been shown to generate more realistic imagery than prior methods, and have been successfully demonstrated in unconditional and class-conditional settings. We investigate fine-grained, continuous control of this model class, and introduce a novel unified framework for semantic diffusion guidance, which allows either language or image guidance, or both. Guidance is injected into a pretrained unconditional diffusion model using the gradient of image-text or image matching scores, without re-training the diffusion model. We explore CLIP-based language guidance as well as both content and style-based image guidance in a unified framework. Our text-guided synthesis approach can be applied to datasets without…
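The guidance mechanism described in the abstract (shifting the sampling step of a pretrained unconditional model along the gradient of a matching score) can be illustrated with a small toy sketch. This is not the authors' implementation. It assumes the common guided-sampling form in which the reverse-step mean is shifted by scale × variance × gradient of the score. A cosine similarity between plain vectors stands in for a CLIP-style image-text score, and a finite-difference gradient stands in for backpropagation. All function names and parameters here (cosine_score, score_gradient, guided_mean, guided_step, scale) are hypothetical.

```python
import math
import random


def cosine_score(x, target):
    """Toy matching score: cosine similarity between x and a target embedding.

    Stands in for an image-text or image-image matching score (assumption).
    """
    dot = sum(a * b for a, b in zip(x, target))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_t = math.sqrt(sum(b * b for b in target))
    return dot / (norm_x * norm_t + 1e-12)


def score_gradient(x, target, eps=1e-5):
    """Central finite-difference gradient of the score with respect to x."""
    grad = []
    for i in range(len(x)):
        hi = list(x)
        lo = list(x)
        hi[i] += eps
        lo[i] -= eps
        grad.append((cosine_score(hi, target) - cosine_score(lo, target)) / (2 * eps))
    return grad


def guided_mean(x_t, mean, variance, target, scale):
    """Shift the model's predicted mean along the score gradient at x_t.

    `mean` and `variance` would come from the pretrained unconditional
    diffusion model, which is left untouched (no retraining).
    """
    grad = score_gradient(x_t, target)
    return [m + scale * variance * g for m, g in zip(mean, grad)]


def guided_step(x_t, mean, variance, target, scale, rng):
    """One guided reverse step: sample around the shifted mean."""
    shifted = guided_mean(x_t, mean, variance, target, scale)
    std = math.sqrt(variance)
    return [m + std * rng.gauss(0.0, 1.0) for m in shifted]


if __name__ == "__main__":
    rng = random.Random(0)
    x_t = [1.0, 0.0]      # current noisy sample (toy 2-D "image")
    mean = [1.0, 0.0]     # unguided mean predicted by the model (toy value)
    target = [0.0, 1.0]   # toy embedding of the guidance signal
    shifted = guided_mean(x_t, mean, 0.5, target, scale=2.0)
    print("score before:", cosine_score(mean, target))
    print("score after: ", cosine_score(shifted, target))
    print("sample:", guided_step(x_t, mean, 0.5, target, 2.0, rng))
```

In this sketch only the sampling step changes; the generative model's parameters are never updated, which is the sense in which the control comes "for free". Combining text and image guidance could be sketched the same way by summing the gradients of two such scores, each with its own scale, though that weighting is an assumption here.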
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Image Retrieval and Classification Techniques · Advanced Image and Video Retrieval Techniques
Methods: Diffusion
