MIRAGE: Benchmarking and Aligning Multi-Instance Image Editing
Ziqian Liu, Stephan Alaniz

TL;DR
MIRAGE introduces a training-free, region-specific editing framework that improves multi-instance image editing accuracy and consistency, addressing over-editing and misalignment issues in complex scenarios.
Contribution
The paper presents a new benchmark for multi-instance editing and a novel, training-free method that improves localized editing precision and background preservation.
Findings
MIRAGE significantly outperforms existing methods on MIRA-Bench and RefEdit-Bench.
The framework achieves precise, instance-level modifications with high background consistency.
Extensive evaluations validate MIRAGE's effectiveness in complex multi-instance editing scenarios.
Abstract
Instruction-guided image editing has seen remarkable progress with models like FLUX.2 and Qwen-Image-Edit, yet these models still struggle in complex scenarios where multiple similar instances each require an individual edit. We observe that state-of-the-art models suffer from severe over-editing and spatial misalignment when faced with multiple identical instances and composite instructions. To study this, we introduce a comprehensive benchmark specifically designed to evaluate fine-grained consistency in multi-instance and multi-instruction settings. To address the failures of existing methods observed in our benchmark, we propose Multi-Instance Regional Alignment via Guided Editing (MIRAGE), a training-free framework that enables precise, localized editing. By leveraging a vision-language model to parse complex instructions into regional subsets, MIRAGE employs a multi-branch parallel…
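The abstract is cut off before it describes the editing branches, so the following is only a toy illustration of the general idea of region-specific editing with background preservation, not MIRAGE's actual pipeline. It assumes an instruction has already been split into per-region sub-instructions (the abstract attributes this step to a vision-language model), edits each region independently, and pastes the results back so that pixels outside every region stay untouched. All names here (RegionEdit, composite_regional_edits, toy_edit) are hypothetical, and the "image" is a small grid of integers standing in for real pixels.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Image = List[List[int]]  # toy grayscale image, row-major


@dataclass
class RegionEdit:
    """One parsed sub-instruction bound to a bounding box (hypothetical type)."""
    box: Tuple[int, int, int, int]  # (top, left, bottom, right), half-open
    instruction: str


def composite_regional_edits(
    image: Image,
    edits: List[RegionEdit],
    edit_fn: Callable[[Image, str], Image],
) -> Image:
    """Edit each region independently and paste it back into a copy.

    Every crop is taken from the ORIGINAL image, so the branches do not
    depend on one another; pixels outside all boxes are copied verbatim.
    """
    result = [row[:] for row in image]
    for edit in edits:
        top, left, bottom, right = edit.box
        crop = [row[left:right] for row in image[top:bottom]]
        edited = edit_fn(crop, edit.instruction)
        # The stand-in editor must return a crop of the same shape.
        if len(edited) != len(crop) or any(
            len(e) != len(c) for e, c in zip(edited, crop)
        ):
            raise ValueError("edit_fn changed the crop shape")
        for dy, row in enumerate(edited):
            result[top + dy][left:right] = row
    return result


def toy_edit(crop: Image, instruction: str) -> Image:
    """Stand-in for an editing model: only understands two instructions."""
    if instruction == "brighten":
        return [[min(255, p + 100) for p in row] for row in crop]
    if instruction == "darken":
        return [[0 for _ in row] for row in crop]
    return [row[:] for row in crop]


image = [[10, 10, 10, 10], [10, 10, 10, 10]]
edits = [
    RegionEdit(box=(0, 0, 2, 1), instruction="brighten"),  # left column
    RegionEdit(box=(0, 3, 2, 4), instruction="darken"),    # right column
]
edited_image = composite_regional_edits(image, edits, toy_edit)
```

In this sketch the two middle columns are never passed to the editor, which is one simple way to guarantee background consistency; a real system would use soft masks and a generative editing model rather than hard boxes and arithmetic.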
