EditInspector: A Benchmark for Evaluation of Text-Guided Image Edits

Ron Yosef; Moran Yanuka; Yonatan Bitton; Dani Lischinski

arXiv:2506.09988 · cs.CV · June 12, 2025

TL;DR

EditInspector is a new benchmark for evaluating text-guided image edits, revealing current models' limitations and introducing methods that improve artifact detection and change description.

Contribution

We introduce EditInspector, a comprehensive benchmark for assessing text-guided image edits, and propose two methods that outperform existing models in artifact detection and change captioning.

Findings

01. Current models struggle to evaluate edits comprehensively.

02. Models frequently hallucinate when describing edit-induced changes.

03. The proposed methods outperform state-of-the-art models in artifact detection and change captioning.

Abstract

Text-guided image editing, fueled by recent advancements in generative AI, is becoming increasingly widespread. This trend highlights the need for a comprehensive framework to verify text-guided edits and assess their quality. To address this need, we introduce EditInspector, a novel benchmark for evaluation of text-guided image edits, based on human annotations collected using an extensive template for edit verification. We leverage EditInspector to evaluate the performance of state-of-the-art (SoTA) vision and language models in assessing edits across various dimensions, including accuracy, artifact detection, visual quality, seamless integration with the image scene, adherence to common sense, and the ability to describe edit-induced changes. Our findings indicate that current models struggle to evaluate edits comprehensively and frequently hallucinate when describing the changes. To…

Taxonomy

Topics: Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis · Artificial Intelligence Applications