ADIEE: Automatic Dataset Creation and Scorer for Instruction-Guided Image Editing Evaluation

Sherry X. Chen; Yi Wei; Luowei Zhou; Suren Kumar

arXiv:2507.07317·cs.CV·July 29, 2025

TL;DR

ADIEE introduces a large-scale automated dataset and a fine-tuned scoring model for instruction-guided image editing evaluation, outperforming existing open-source models and enabling automated assessment and improvement of editing models.

Contribution

We developed ADIEE, an automated dataset creation method, and used it to fine-tune a scoring model that improves evaluation accuracy for instruction-guided image editing.

Findings

1. Outperforms all open-source VLMs and Gemini-Pro 1.5 on benchmarks.

2. Achieves a 17.24% increase in correlation with human ratings.

3. Boosts MagicBrush evaluation score by 8.98%.

Abstract

Recent advances in instruction-guided image editing underscore the need for effective automated evaluation. While Vision-Language Models (VLMs) have been explored as judges, open-source models struggle with alignment, and proprietary models lack transparency and cost efficiency. Additionally, no public training datasets exist to fine-tune open-source VLMs, only small benchmarks with diverse evaluation schemes. To address this, we introduce ADIEE, an automated dataset creation approach which is then used to train a scoring model for instruction-guided image editing evaluation. We generate a large-scale dataset with over 100K samples and use it to fine-tune a LLaVA-NeXT-8B model modified to decode a numeric score from a custom token. The resulting scorer outperforms all open-source VLMs and Gemini-Pro 1.5 across all benchmarks, achieving a 0.0696 (+17.24%) gain in score correlation with…
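
To make the reported correlation gain concrete, here is a minimal sketch of how a scorer's agreement with human ratings might be measured. It assumes a Spearman rank correlation, which the visible abstract text does not name, and it uses hypothetical toy ratings rather than data from the paper. The helper `implied_baseline` is also hypothetical: it assumes the +17.24% figure is relative to the previous best correlation, in which case a 0.0696 absolute gain implies a baseline of roughly 0.40.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def implied_baseline(absolute_gain, relative_gain):
    """Baseline value implied by an absolute gain and its relative size.

    Assumes relative_gain = absolute_gain / baseline (an assumption, not
    something the source states explicitly).
    """
    return absolute_gain / relative_gain


# Hypothetical toy data: human ratings and scorer outputs for five edits.
human_ratings = [1.0, 3.5, 2.0, 4.5, 5.0]
scorer_outputs = [0.8, 2.5, 2.9, 4.0, 4.9]

rho = spearman(human_ratings, scorer_outputs)
baseline = implied_baseline(0.0696, 0.1724)
print(f"Spearman correlation on toy data: {rho:.3f}")
print(f"Implied baseline correlation: {baseline:.4f}")
```

Rank correlation is used in this sketch because it depends only on the ordering of scores, so a scorer and human raters can use different numeric scales and still be compared.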
