ZONE: Zero-Shot Instruction-Guided Local Editing
Shanglin Li, Bohan Zeng, Yutang Feng, Sicheng Gao, Xuhui Liu, Jiaming Liu, Li Lin, Xu Tang, Yao Hu, Jianzhuang Liu, Baochang Zhang

TL;DR
ZONE introduces a zero-shot, instruction-guided local image editing method that enables precise, user-friendly modifications of specific regions in images without affecting the rest, outperforming existing techniques.
Contribution
The paper presents a novel zero-shot local editing approach using instruction conversion, region extraction, and seamless blending, improving precision and usability over prior methods.
Findings
Achieves superior local editing quality compared to state-of-the-art methods.
Enables arbitrary region manipulation with a single instruction.
Demonstrates high user-friendliness and editing accuracy.
Abstract
Recent advances in vision-language models like Stable Diffusion have shown remarkable power in creative image synthesis and editing. However, most existing text-to-image editing methods encounter two obstacles. First, the text prompt needs to be carefully crafted to achieve good results, which is not intuitive or user-friendly. Second, they are insensitive to local edits and can irreversibly affect non-edited regions, leaving obvious editing traces. To tackle these problems, we propose a Zero-shot instructiON-guided local image Editing approach, termed ZONE. We first convert the editing intent from the user-provided instruction (e.g., "make his tie blue") into specific image editing regions through InstructPix2Pix. We then propose a Region-IoU scheme for precise image layer extraction from an off-the-shelf segment model. We further develop an edge smoother based on FFT for seamless…
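The abstract names three steps: locating the edit region, extracting a layer by IoU against segmentation masks, and FFT-based edge smoothing before blending. The sketch below is only an illustration of that general idea, not the authors' implementation. It assumes a rough binary edit region and a list of candidate masks are already available (for example from an instruction-based editor and a segmentation model), and every function name and the `sigma` parameter are hypothetical. The Gaussian low-pass filter stands in for whatever FFT-based smoother the paper actually uses.

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two binary masks of the same shape."""
    a, b = a.astype(bool), b.astype(bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 0.0
    return float(np.logical_and(a, b).sum() / union)

def pick_mask(rough_region, candidate_masks):
    """Return the candidate mask that best overlaps the rough edit region.

    Assumption: the best layer is simply the highest-IoU candidate.
    """
    scores = [iou(rough_region, m) for m in candidate_masks]
    best = int(np.argmax(scores))
    return candidate_masks[best], scores[best]

def fft_smooth_mask(mask, sigma=0.05):
    """Soften mask edges with a Gaussian low-pass filter applied in the frequency domain.

    sigma is a hypothetical cutoff in cycles per pixel; smaller means softer edges.
    """
    h, w = mask.shape
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    lowpass = np.exp(-(fx ** 2 + fy ** 2) / (2.0 * sigma ** 2))
    smooth = np.fft.ifft2(np.fft.fft2(mask.astype(float)) * lowpass).real
    return np.clip(smooth, 0.0, 1.0)

def blend(original, edited, soft_mask):
    """Alpha-blend the edited image into the original using the soft mask."""
    alpha = soft_mask[..., None]
    return alpha * edited + (1.0 - alpha) * original

# Toy usage with synthetic data (no real images or models involved).
rough = np.zeros((64, 64), dtype=bool)
rough[16:40, 16:40] = True
cand_a = np.zeros((64, 64), dtype=bool)
cand_a[18:42, 18:42] = True
cand_b = np.zeros((64, 64), dtype=bool)
cand_b[0:8, 0:8] = True

layer, score = pick_mask(rough, [cand_a, cand_b])
soft = fft_smooth_mask(layer)
original = np.zeros((64, 64, 3))
edited = np.ones((64, 64, 3))
result = blend(original, edited, soft)
```

In this toy setup the edited content is kept inside the selected layer, the original is kept elsewhere, and the transition between them is gradual rather than a hard cut, which is the effect the abstract attributes to edge smoothing.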
Taxonomy
Topics: Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis · Advanced Image and Video Retrieval Techniques
Methods: Diffusion
