I2I-Bench: A Comprehensive Benchmark Suite for Image-to-Image Editing Models

Juntong Wang; Jiarui Wang; Huiyu Duan; Jiaxiang Kang; Guangtao Zhai; Xiongkuo Min

arXiv:2512.04660 · cs.CV · December 5, 2025

Open Access

TL;DR

I2I-Bench is a benchmark suite for image-to-image editing models that spans 10 task categories and 30 evaluation dimensions, pairing automated evaluation (specialized tools and large multimodal models) with validation against human judgments.

Contribution

The paper introduces I2I-Bench, a new benchmark with diverse tasks, detailed evaluation metrics, and automated assessment methods for image editing models.

Findings

01 The benchmark reveals gaps and trade-offs among existing image editing models.

02 Automated evaluation methods align well with human preferences.

03 I2I-Bench covers 10 task categories and 30 evaluation dimensions.

Abstract

Image editing models are advancing rapidly, yet comprehensive evaluation remains a significant challenge. Existing image editing benchmarks generally suffer from limited task scopes, insufficient evaluation dimensions, and heavy reliance on manual annotations, which significantly constrain their scalability and practical applicability. To address this, we propose I2I-Bench, a comprehensive benchmark for image-to-image editing models, which features (i) diverse tasks, encompassing 10 task categories across both single-image and multi-image editing tasks, (ii) comprehensive evaluation dimensions, including 30 decoupled and fine-grained evaluation dimensions with automated hybrid evaluation methods that integrate specialized tools and large multimodal models (LMMs), and (iii) rigorous alignment validation, justifying the consistency between our benchmark evaluations and human…
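The alignment validation in point (iii) compares automated benchmark scores with human judgments. The excerpt above does not say which agreement statistic the authors use, so the following is only a minimal sketch under an assumption: it computes Spearman rank correlation, a common choice for this kind of check. The function names (`average_ranks`, `spearman`) and the score lists are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def average_ranks(values: Sequence[float]) -> list[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with the one at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


if __name__ == "__main__":
    # Hypothetical scores for five edited images on one evaluation dimension.
    automated_scores = [0.91, 0.40, 0.75, 0.62, 0.18]
    human_ratings = [4.5, 2.0, 4.0, 3.5, 1.0]
    print(f"Spearman correlation: {spearman(automated_scores, human_ratings):.3f}")
```

A value near 1 would indicate that the automated metric orders outputs the same way human raters do. In a benchmark with decoupled dimensions, a check like this would plausibly be run once per dimension rather than on a single pooled score.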

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Visual Attention and Saliency Detection