An Item is Worth a Prompt: Versatile Image Editing with Disentangled Control
Aosong Feng, Weikang Qiu, Jinbin Bai, Xiao Zhang, Zhen Dong, Kaicheng Zhou, Rex Ying, Leandros Tassiulas

TL;DR
This paper introduces D-Edit, a novel framework for versatile image editing using disentangled item-prompt interactions in pretrained diffusion models, enabling precise, multi-modal editing including mask-based and item removal tasks.
Contribution
D-Edit uniquely disentangles image-prompt interactions into item-specific prompts, allowing flexible and precise editing across multiple modalities within a unified framework.
Findings
Achieves state-of-the-art results in four editing types: image-based editing, text-based editing, mask-based editing, and item removal.
First framework to enable item editing via mask editing.
Successfully combines image and text-based editing.
Abstract
Building on the success of text-to-image diffusion models (DPMs), image editing is an important application for enabling human interaction with AI-generated content. Among various editing methods, editing within the prompt space has gained more attention because of its capacity and simplicity in controlling semantics. However, since diffusion models are commonly pretrained on descriptive text captions, directly editing words in text prompts usually leads to completely different generated images, violating the requirements of image editing. On the other hand, existing editing methods usually introduce spatial masks to preserve the identity of unedited regions, but these masks are usually ignored by DPMs, which leads to inharmonious editing results. Targeting these two challenges, in this work, we propose to disentangle the comprehensive image-prompt interaction into several item-prompt…
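As a rough illustration of what an "item-prompt interaction" could look like, the sketch below restricts cross-attention so that each image location attends only to the prompt tokens assigned to its own item. This is not taken from the paper: the function name, the use of a segmentation-style item id per pixel, and the hard masking rule are all assumptions made for illustration, and the truncated abstract does not specify the actual mechanism.

```python
import numpy as np

def item_prompt_cross_attention(pixel_q, token_k, token_v, pixel_item, token_item):
    """Hypothetical sketch: cross-attention limited to matching item ids.

    pixel_q:    (P, d)   query vectors, one per image location
    token_k:    (T, d)   key vectors, one per prompt token
    token_v:    (T, d_v) value vectors, one per prompt token
    pixel_item: (P,)     integer item id per image location (assumed to come from a mask)
    token_item: (T,)     integer item id per prompt token (assumed grouping)
    """
    d = pixel_q.shape[1]
    scores = pixel_q @ token_k.T / np.sqrt(d)                # (P, T) attention logits
    allowed = pixel_item[:, None] == token_item[None, :]     # (P, T) same-item pairs
    if not allowed.any(axis=1).all():
        raise ValueError("every item present in the image needs at least one prompt token")
    scores = np.where(allowed, scores, -np.inf)              # block cross-item attention
    scores = scores - scores.max(axis=1, keepdims=True)      # numerically stable softmax
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ token_v, weights
```

Under this assumed masking, changing the tokens of one item leaves the attention output of every other item's pixels untouched, which is the kind of locality that the abstract says plain prompt editing lacks.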
Taxonomy
Topics: Computer Graphics and Visualization Techniques · Advanced Vision and Imaging
Methods: Diffusion
