DLEBench: Evaluating Small-scale Object Editing Ability for Instruction-based Image Editing Model
Shibo Hong, Boxian Ai, Jun Kuang, Wei Wang, FengJiao Chen, Zhongyuan Peng, Chenhao Huang, Yixin Cao

TL;DR
DLEBench is a new benchmark designed to evaluate instruction-based image editing models' ability to precisely edit small objects, revealing significant performance gaps and guiding future improvements.
Contribution
The paper introduces DLEBench, the first dedicated benchmark for small-scale object editing in instruction-based image editing models, with a comprehensive evaluation protocol and diverse challenging samples.
Findings
Empirical results show significant performance gaps in current models' small-object editing abilities.
The benchmark includes 1889 samples with complex scenarios like occlusion and multi-object editing.
A dual-mode evaluation framework addresses the misalignment between automated and human judgments.
Abstract
Significant progress has been made in the field of Instruction-based Image Editing Models (IIEMs). However, while these models demonstrate plausible adherence to instructions and strong reasoning ability on current benchmarks, their ability to edit small objects remains underexplored, despite its importance for precise local editing and refining details in both real and generated images. In this paper, we introduce DeepLookEditBench (DLEBench), the first benchmark dedicated to assessing the abilities of IIEMs in editing small-scale objects. Specifically, we construct a challenging testbed comprising 1889 samples across seven instruction types. In these samples, target objects occupy only 1%-10% of the image area, covering complex scenarios such as partial occlusion and multi-object editing. To ensure robust evaluation on this benchmark, we propose an evaluation protocol with refined…
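As a rough illustration of the 1%-10% area criterion mentioned in the abstract, the sketch below checks whether a target object counts as small-scale given a binary mask. This is not the authors' selection pipeline: the mask representation (nested lists of 0/1), the function names `area_ratio` and `is_small_scale`, and the inclusive treatment of both bounds are all assumptions made for the example.

```python
from typing import Sequence


def area_ratio(mask: Sequence[Sequence[int]]) -> float:
    """Return the fraction of pixels in a binary mask that belong to the object."""
    total = sum(len(row) for row in mask)
    if total == 0:
        raise ValueError("mask must contain at least one pixel")
    covered = sum(1 for row in mask for px in row if px)
    return covered / total


def is_small_scale(
    mask: Sequence[Sequence[int]], low: float = 0.01, high: float = 0.10
) -> bool:
    """Check whether the object covers between `low` and `high` of the image.

    The 1%-10% range comes from the abstract; treating both bounds as
    inclusive is an assumption of this sketch.
    """
    return low <= area_ratio(mask) <= high


def make_mask(height: int, width: int, obj_h: int, obj_w: int) -> list:
    """Build a hypothetical mask with an obj_h x obj_w object in the top-left corner."""
    return [
        [1 if r < obj_h and c < obj_w else 0 for c in range(width)]
        for r in range(height)
    ]


if __name__ == "__main__":
    small = make_mask(10, 10, 2, 2)   # 4 of 100 pixels
    large = make_mask(10, 10, 5, 5)   # 25 of 100 pixels
    print(is_small_scale(small), is_small_scale(large))
```

In practice the mask would come from a segmentation annotation or a bounding box, and whether the ratio is measured on the mask or the box would change which samples qualify.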
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Generative Adversarial Networks and Image Synthesis
