Prompt Tuning Inversion for Text-Driven Image Editing Using Diffusion Models
Wenkai Dong, Song Xue, Xiaoyue Duan, Shumin Han

TL;DR
This paper introduces Prompt Tuning Inversion, a fast and accurate method for text-driven image editing with diffusion models that balances editability and fidelity, outperforming existing techniques.
Contribution
The paper proposes a novel inversion technique that encodes input images into learnable embeddings, enabling high-quality, user-friendly image editing guided solely by text prompts.
Findings
Outperforms state-of-the-art baselines in ImageNet editing tasks
Achieves a good balance between editability and image fidelity
Enables precise color and object modifications using only text prompts
Abstract
Recently large-scale language-image models (e.g., text-guided diffusion models) have considerably improved the image generation capabilities to generate photorealistic images in various domains. Based on this success, current image editing methods use texts to achieve intuitive and versatile modification of images. To edit a real image using diffusion models, one must first invert the image to a noisy latent from which an edited image is sampled with a target text prompt. However, most methods lack one of the following: user-friendliness (e.g., additional masks or precise descriptions of the input image are required), generalization to larger domains, or high fidelity to the input image. In this paper, we design an accurate and quick inversion technique, Prompt Tuning Inversion, for text-driven image editing. Specifically, our proposed editing method consists of a reconstruction stage…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsGenerative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning
MethodsDiffusion
