IE-Critic-R1: Advancing the Explanatory Measurement of Text-Driven Image Editing for Human Perception Alignment

Bowen Qu; Shangkun Sun; Xiaoyu Liang; Wei Gao

arXiv:2511.18055·cs.CV·November 25, 2025

TL;DR

This paper introduces a new benchmark and an explainable evaluation model for text-driven image editing, improving alignment with human perception and addressing limitations of previous assessment methods.

Contribution

The work presents IE-Bench, a comprehensive dataset, and IE-Critic-R1, a reinforcement learning-based metric that better correlates with human perception in image editing evaluation.
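How well a metric "correlates with human perception" is usually quantified by comparing its scores against human ratings on the same samples. The sketch below illustrates this with two common choices in quality assessment, the Pearson linear correlation coefficient (PLCC) and the Spearman rank correlation coefficient (SRCC). This page does not state which measures the paper reports, so that choice is an assumption, and the function names and score values are hypothetical illustrations, not data from IE-Bench.

```python
from math import sqrt


def pearson(x, y):
    """Pearson linear correlation (PLCC) between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(vx * vy)


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation (SRCC): Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical example: human opinion scores and a metric's predictions
# for five edited images (made-up numbers for illustration only).
human_scores = [4.5, 2.0, 3.5, 1.0, 5.0]
metric_scores = [0.81, 0.40, 0.55, 0.35, 0.90]

plcc = pearson(human_scores, metric_scores)
srcc = spearman(human_scores, metric_scores)
```

SRCC depends only on the ordering of the scores, so it rewards a metric that ranks edits the way humans do even when the two score scales differ. PLCC additionally measures how linear the relationship is.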

Findings

01. IE-Critic-R1 aligns more closely with human subjective judgments than previous metrics.

02. The benchmark includes nearly 4,000 samples with human scores.

03. The method provides more explainable quality assessments.

Abstract

Recent advances in text-driven image editing have been significant, yet accurately evaluating the edited images remains a considerable challenge. Unlike text-driven image generation, text-driven image editing conditions simultaneously on both text and a source image. The edited images often retain an intrinsic connection to the original image, which dynamically changes with the semantics of the text. However, previous methods tend to focus solely on text-image alignment or are not well aligned with human perception. In this work, we introduce the Text-driven Image Editing Benchmark suite (IE-Bench) to enhance the assessment of text-driven edited images. IE-Bench includes a database that contains diverse source images, various editing prompts, and the corresponding edited results from different editing methods, and nearly…

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Digital Humanities and Scholarship · Multimodal Machine Learning Applications