Specify and Edit: Overcoming Ambiguity in Text-Based Image Editing
Ekaterina Iakovleva, Fabio Pizzati, Philip Torr, Stéphane Lathuilière

TL;DR
SANE introduces a zero-shot diffusion-based image editing pipeline that leverages large language models to decompose ambiguous instructions into specific edits, improving interpretability, diversity, and performance across datasets.
Contribution
The paper presents a novel zero-shot method that uses an LLM to decompose an ambiguous instruction into specific edits, and pairs it with a new denoising guidance strategy for diffusion-based image editing.
Findings
Improves editing accuracy on ambiguous instructions
Enhances output diversity and interpretability
Demonstrates effectiveness across multiple datasets and baselines
Abstract
Text-based editing diffusion models exhibit limited performance when the user's input instruction is ambiguous. To solve this problem, we propose Specify ANd Edit (SANE), a zero-shot inference pipeline for diffusion-based editing systems. We use a large language model (LLM) to decompose the input instruction into specific instructions, i.e., well-defined interventions to apply to the input image to satisfy the user's request. We benefit from the LLM-derived instructions along with the original one, thanks to a novel denoising guidance strategy specifically designed for the task. Our experiments with three baselines and on two datasets demonstrate the benefits of SANE in all setups. Moreover, our pipeline improves the interpretability of editing models and boosts the output diversity. We also demonstrate that our approach can be applied to any edit, whether ambiguous or not. Our…
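The abstract does not spell out the guidance rule, so the following is only a minimal sketch of how noise predictions conditioned on the original instruction and on several LLM-derived specific instructions might be combined in a classifier-free-guidance style. The function name `combine_guidance`, the two scale parameters, and the averaging over specific instructions are all assumptions made for illustration; this is not SANE's actual formulation.

```python
from typing import List, Sequence

Vector = List[float]


def combine_guidance(
    eps_uncond: Sequence[float],
    eps_original: Sequence[float],
    eps_specific: Sequence[Sequence[float]],
    scale_original: float = 7.5,   # hypothetical weight for the original instruction
    scale_specific: float = 2.0,   # hypothetical total weight shared by specific instructions
) -> Vector:
    """Combine noise predictions from several text conditions (illustrative only).

    Each prediction is a flat list of floats standing in for a denoiser output.
    The unconditional prediction is pushed toward the original instruction and,
    with an averaged weight, toward each specific instruction.
    """
    combined = list(eps_uncond)
    for i in range(len(combined)):
        # Guidance term for the user's original (possibly ambiguous) instruction.
        combined[i] += scale_original * (eps_original[i] - eps_uncond[i])
        # Guidance terms for the specific instructions, averaged so the total
        # contribution does not grow with the number of instructions.
        for eps_k in eps_specific:
            weight = scale_specific / len(eps_specific)
            combined[i] += weight * (eps_k[i] - eps_uncond[i])
    return combined


# Toy usage with two-dimensional "noise predictions".
result = combine_guidance(
    eps_uncond=[0.0, 0.0],
    eps_original=[1.0, 0.0],
    eps_specific=[[0.0, 1.0], [0.0, 3.0]],
    scale_original=2.0,
    scale_specific=1.0,
)
```

In a real pipeline the vectors would be denoiser outputs at a given timestep, and the specific instructions would come from prompting an LLM with the user's request; both steps are omitted here.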
Taxonomy
Topics: Semantic Web and Ontologies · Digital Humanities and Scholarship · Scientific Computing and Data Management
Methods: Diffusion
