Imagic: Text-Based Real Image Editing with Diffusion Models

Bahjat Kawar; Shiran Zada; Oran Lang; Omer Tov; Huiwen Chang; Tali Dekel; Inbar Mosseri; Michal Irani

arXiv:2210.09276 · cs.CV · March 21, 2023 · 34 cites

TL;DR

Imagic is a novel method that enables complex, non-rigid, text-guided edits to real images using a single input image and a pre-trained diffusion model, without requiring masks or multiple views.

Contribution

This work introduces the first approach for applying complex semantic edits to real images with only one input image and text, leveraging diffusion models and fine-tuning for high-quality results.

Findings

01. Enables diverse complex edits like posture and object changes

02. Operates on high-resolution real images without masks or multiple views

03. Produces high-quality, natural-looking edited images

Abstract

Text-conditioned image editing has recently attracted considerable interest. However, most methods are currently either limited to specific editing types (e.g., object overlay, style transfer), or apply to synthetically generated images, or require multiple input images of a common object. In this paper we demonstrate, for the very first time, the ability to apply complex (e.g., non-rigid) text-guided semantic edits to a single real image. For example, we can change the posture and composition of one or multiple objects inside an image, while preserving its original characteristics. Our method can make a standing dog sit down or jump, cause a bird to spread its wings, etc. -- each within its single high-resolution natural image provided by the user. Contrary to previous work, our proposed method requires only a single input image and a target text (the desired edit). It operates on real…

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Video Analysis and Summarization · Image Retrieval and Classification Techniques

Methods: Diffusion