ADIEE: Automatic Dataset Creation and Scorer for Instruction-Guided Image Editing Evaluation
Sherry X. Chen, Yi Wei, Luowei Zhou, Suren Kumar

TL;DR
ADIEE introduces a large-scale automated dataset and a fine-tuned scoring model for instruction-guided image editing evaluation, outperforming existing open-source models and enabling automated assessment and improvement of editing models.
Contribution
We present ADIEE, an automated dataset creation method, together with a scoring model fine-tuned on the resulting data that improves evaluation accuracy for instruction-guided image editing.
Findings
The ADIEE scorer outperforms all open-source VLMs and Gemini-Pro 1.5 across all benchmarks.
It achieves a 0.0696 (+17.24%) gain in score correlation with human ratings.
Boosts MagicBrush evaluation score by 8.98%.
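The correlation finding can be read two ways, since the abstract gives it as both an absolute gain (0.0696) and a relative gain (+17.24%). The short calculation below shows what those two figures jointly imply, assuming the percentage is measured relative to a single baseline correlation. That assumption, and the variable names, are ours; the excerpt does not say which baseline or which correlation measure is used.

```python
# Figures quoted in the abstract.
absolute_gain = 0.0696   # absolute gain in score correlation
relative_gain = 0.1724   # the same gain expressed as +17.24%

# Assumption: relative_gain = absolute_gain / baseline, for one baseline value.
implied_baseline = absolute_gain / relative_gain
implied_scorer = implied_baseline + absolute_gain

print(f"implied baseline correlation: {implied_baseline:.4f}")  # 0.4037
print(f"implied scorer correlation:   {implied_scorer:.4f}")    # 0.4733
```

These are derived values, not numbers reported on this page; they only indicate the rough scale of the correlations involved.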
Abstract
Recent advances in instruction-guided image editing underscore the need for effective automated evaluation. While Vision-Language Models (VLMs) have been explored as judges, open-source models struggle with alignment, and proprietary models lack transparency and cost efficiency. Additionally, no public training datasets exist to fine-tune open-source VLMs, only small benchmarks with diverse evaluation schemes. To address this, we introduce ADIEE, an automated dataset creation approach which is then used to train a scoring model for instruction-guided image editing evaluation. We generate a large-scale dataset with over 100K samples and use it to fine-tune a LLaVA-NeXT-8B model modified to decode a numeric score from a custom token. The resulting scorer outperforms all open-source VLMs and Gemini-Pro 1.5 across all benchmarks, achieving a 0.0696 (+17.24%) gain in score correlation with…
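To make "decode a numeric score from a custom token" concrete, here is a minimal sketch in plain Python. It assumes the model emits a special token, takes the hidden state at that token's position, and maps it to a bounded score through a linear layer and a sigmoid. The token name `<score>`, the 0 to 10 range, the toy hidden states, and the function `decode_score` are all hypothetical; the excerpt does not describe the actual architecture or score range.

```python
import math

SCORE_TOKEN = "<score>"  # hypothetical name for the custom token


def decode_score(tokens, hidden_states, weights, bias=0.0, max_score=10.0):
    """Map the hidden state at the custom token to a score in (0, max_score).

    tokens:        list of token strings produced by the model
    hidden_states: one feature vector (list of floats) per token
    weights, bias: parameters of a hypothetical linear score head
    """
    pos = tokens.index(SCORE_TOKEN)  # raises ValueError if the token is absent
    h = hidden_states[pos]
    logit = sum(w * x for w, x in zip(weights, h)) + bias
    # Sigmoid squashes the logit to (0, 1); scale to the assumed score range.
    return max_score / (1.0 + math.exp(-logit))


# Toy example with 2-dimensional hidden states.
tokens = ["rate", "this", "edit", SCORE_TOKEN]
hidden_states = [[0.1, 0.2], [0.0, -0.3], [0.5, 0.5], [1.0, -1.0]]
score = decode_score(tokens, hidden_states, weights=[0.5, 0.5])
print(score)  # 5.0, since the logit at the score token is 0
```

In a real fine-tuning setup the head would be trained jointly with the VLM against the dataset's target scores; this sketch only illustrates the readout step.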
