EditRefiner: A Human-Aligned Agentic Framework for Image Editing Refinement

Zitong Xu; Huiyu Duan; Yifei Nie; Mingda Du; Sijing Wu; Xiongkuo Min; Tianyi Zheng; Jian Zhang; Shusong Xu; Jinwei Chen; Bo Li; Guangtao Zhai

arXiv:2605.07457·cs.CV·May 11, 2026

TL;DR

EditRefiner introduces a human-aligned, hierarchical framework for image editing refinement. It uses a new dataset and a perception-reasoning-action-evaluation loop to improve local corrections and perceptual quality.

Contribution

The paper presents a new dataset and a hierarchical agentic framework for human-aligned, self-corrective image editing refinement that outperforms existing methods.

Findings

01

Outperforms state-of-the-art methods in distortion localization and diagnosis accuracy.

02

Achieves closer alignment with human perception than existing methods.

03

Establishes a new paradigm for self-corrective image editing.

Abstract

Recent text-guided image editing (TIE) models have made remarkable progress, yet edited images still frequently suffer from fine-grained issues such as unnatural objects, lighting mismatch, and unexpected changes. Existing refinement approaches either rely on costly iterative regeneration or employ vision-language models (VLMs) with weak spatial grounding, often resulting in semantic drift and unreliable local corrections. To address these limitations, we first construct EditFHF-15K, a dataset of fine-grained human feedback for edited images, comprising (1) 15K images from 12 TIE models spanning 43 editing tasks, (2) 60K annotated artifact regions and 80K editing failure regions, each accompanied by textual reasoning, and (3) 45K mean opinion scores (MOSs) assessing perceptual quality, instruction following, and visual consistency. Based on EditFHF-15K, we propose EditRefiner, a…
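To make the three annotation types in the abstract concrete, here is a minimal sketch of what one EditFHF-15K record might look like. Every class, field, and function name is hypothetical, and the bounding-box region format is an assumption (the abstract only says "regions"); the released dataset may be organized differently. The three scores per image follow the three dimensions named in the abstract, which would be consistent with 45K MOSs over 15K images.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical schema for illustration only; not the dataset's actual format.


@dataclass
class Region:
    kind: str                        # "artifact" or "editing_failure"
    box: Tuple[int, int, int, int]   # assumed (x0, y0, x1, y1) pixel box
    reasoning: str                   # textual reasoning attached to the region


@dataclass
class EditRecord:
    instruction: str                 # the text editing instruction
    model: str                       # which TIE model produced the edit
    task: str                        # which editing task the sample belongs to
    regions: List[Region] = field(default_factory=list)
    mos_quality: float = 0.0         # perceptual quality
    mos_instruction: float = 0.0     # instruction following
    mos_consistency: float = 0.0     # visual consistency


def count_regions(records: List[EditRecord], kind: str) -> int:
    """Count annotated regions of one kind across all records."""
    return sum(1 for rec in records for reg in rec.regions if reg.kind == kind)


def count_mos(records: List[EditRecord]) -> int:
    """Each record carries three MOS values, one per rated dimension."""
    return 3 * len(records)
```

A loader built on a schema like this could filter by `kind` to separate artifact regions from editing failure regions before training a localization model.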

Code & Models

Repositories

IntMeGroup/EditRefiner (GitHub)