DiffusionCLIP: Text-Guided Diffusion Models for Robust Image Manipulation
Gwanghyun Kim, Taesung Kwon, Jong Chul Ye

TL;DR
DiffusionCLIP is a text-guided image manipulation method built on diffusion models. It addresses the limitations of GAN-based approaches on diverse real-world images with novel poses and contents, and achieves robust zero-shot manipulation.
Contribution
The paper presents DiffusionCLIP, a novel diffusion model-based approach for zero-shot, text-guided image manipulation that handles diverse real images and multi-attribute editing.
Findings
Outperforms existing baselines in manipulation quality.
Effective manipulation across unseen domains and diverse datasets.
Human evaluations confirm robustness and superiority.
Abstract
Recently, GAN inversion methods combined with Contrastive Language-Image Pretraining (CLIP) enable zero-shot image manipulation guided by text prompts. However, their applications to diverse real images are still difficult due to the limited GAN inversion capability. Specifically, these approaches often have difficulties in reconstructing images with novel poses, views, and highly variable contents compared to the training data, altering object identity, or producing unwanted image artifacts. To mitigate these problems and enable faithful manipulation of real images, we propose a novel method, dubbed DiffusionCLIP, that performs text-driven image manipulation using diffusion models. Based on full inversion capability and high-quality image generation power of recent diffusion models, our method performs zero-shot image manipulation successfully even between unseen domains and takes…
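The "full inversion capability" mentioned in the abstract can be illustrated with a small sketch of deterministic DDIM-style inversion: the same noise-free update is run toward higher noise to obtain a latent, then back toward lower noise to reconstruct the input. This is a minimal sketch under stated assumptions, not the authors' implementation. The names `ddim_step`, `toy_eps_model`, `make_alphas`, `invert`, `generate`, and `reconstruction_error` are hypothetical, the noise predictor is a toy linear function standing in for a trained network, and the linear schedule of cumulative alphas is assumed for illustration only.

```python
import math


def ddim_step(x, eps, a_from, a_to):
    """One deterministic DDIM update (no added noise) between two noise levels."""
    out = []
    for xi, ei in zip(x, eps):
        # Estimate the clean sample implied by the current state and noise.
        x0_pred = (xi - math.sqrt(1.0 - a_from) * ei) / math.sqrt(a_from)
        # Re-noise that estimate to the target level with the same noise.
        out.append(math.sqrt(a_to) * x0_pred + math.sqrt(1.0 - a_to) * ei)
    return out


def toy_eps_model(x, t):
    # Hypothetical stand-in for a trained noise-prediction network.
    return [0.1 * v for v in x]


def make_alphas(num_steps, a_start=0.9999, a_end=0.02):
    # Assumed schedule: cumulative alphas falling linearly from a_start to a_end.
    return [a_start + (a_end - a_start) * i / num_steps
            for i in range(num_steps + 1)]


def invert(x0, eps_model, alphas):
    """Run the deterministic update toward higher noise to get a latent."""
    x = list(x0)
    for t in range(len(alphas) - 1):
        x = ddim_step(x, eps_model(x, t), alphas[t], alphas[t + 1])
    return x


def generate(x_latent, eps_model, alphas):
    """Run the same update toward lower noise to reconstruct the input."""
    x = list(x_latent)
    for t in range(len(alphas) - 1, 0, -1):
        x = ddim_step(x, eps_model(x, t), alphas[t], alphas[t - 1])
    return x


def reconstruction_error(x0, num_steps=50):
    """Largest per-element gap between x0 and its inverted-then-generated copy."""
    alphas = make_alphas(num_steps)
    latent = invert(x0, toy_eps_model, alphas)
    recon = generate(latent, toy_eps_model, alphas)
    return max(abs(a - b) for a, b in zip(x0, recon))
```

Calling `reconstruction_error([0.5, -0.25, 1.0])` returns the round-trip gap for a three-element toy "image". The round trip is approximate because the noise estimate is evaluated at slightly different states in each direction, and the gap typically shrinks as `num_steps` grows. Text-guided editing itself is outside the scope of this sketch.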
Taxonomy
Topics: Image Processing Techniques and Applications · Advanced Image Processing Techniques · Generative Adversarial Networks and Image Synthesis
Methods: Diffusion
