Editor's Choice: Evaluating Abstract Intent in Image Editing through Atomic Entity Analysis
Mor Ventura, Roy Hirsch, Yonatan Bitton, Regev Cohen, Roi Reichart

TL;DR
This paper introduces AbstractEdit, a benchmark for abstract image editing instructions, and Entity-Rubrics, a framework that evaluates such edits entity by entity. Evaluating 11 models shows that stronger LLM encoders and iterative thinking help.
Contribution
It formalizes the concept of abstract image editing, proposes Entity-Rubrics for assessment, and presents AbstractEdit, the first benchmark dedicated to this challenging domain.
Findings
Models struggle to balance intent and preservation in abstract editing.
Advanced LLM encoders and iterative thinking improve performance on abstract edits.
Standard architectures often under-edit or over-edit abstract instructions.
Abstract
Humans naturally communicate through abstract concepts like "mood". However, current image editing benchmarks focus primarily on explicit, literal commands, leaving abstract instructions largely underexplored. In this work, we first formalize the definition and taxonomy of abstract image editing. To measure instruction-following in this challenging domain, we introduce Entity-Rubrics, a framework that breaks down abstract edits into individual, entity-level assessments and achieves strong correlation with human judgment. Alongside this framework, we contribute AbstractEdit, the first benchmark dedicated to abstract image editing across diverse real-world scenes. Evaluating 11 leading models on this dataset reveals a fundamental challenge: standard architectures struggle to balance intent and preservation, commonly defaulting to under-editing or over-editing. Our analysis demonstrates…
