PICABench: How Far Are We from Physically Realistic Image Editing?

Yuandong Pu; Le Zhuo; Songhao Han; Jinbo Xing; Kaiwen Zhu; Shuo Cao; Bin Fu; Si Liu; Hongsheng Li; Yu Qiao; Wenlong Zhang; Xi Chen; Yihao Liu

arXiv:2510.17681 · cs.CV · January 6, 2026

TL;DR

This paper introduces PICABench, a benchmark for evaluating physical realism in image editing. It shows that most mainstream models fail to capture physical effects and proposes learning physics from videos to improve physical consistency in edited images.

Contribution

The paper presents PICABench for systematic evaluation of physical realism in image editing and introduces PICAEval, a new evaluation protocol using vision-language models and human annotations.

Findings

01 Physical realism in image editing remains a significant challenge.

02 Most mainstream models do not adequately capture physical effects.

03 Learning physics from videos can improve physical realism.

Abstract

Image editing has achieved remarkable progress recently. Modern editing models can already follow complex instructions to manipulate the original content. However, beyond completing the editing instructions, the accompanying physical effects are key to generation realism. For example, removing an object should also remove its shadow, reflections, and interactions with nearby objects. Unfortunately, existing models and benchmarks mainly focus on instruction completion and overlook these physical effects. So, at this moment, how far are we from physically realistic image editing? To answer this, we introduce PICABench, which systematically evaluates physical realism across eight sub-dimensions (spanning optics, mechanics, and state transitions) for most of the common editing operations (add, remove, attribute change, etc.). We further propose PICAEval, a reliable evaluation…
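As a rough illustration of how results from a benchmark organized along these lines might be summarized, the sketch below aggregates per-sample pass/fail judgments into per-category accuracy. Only the three category names come from the abstract; the record layout, the `aggregate` helper, and the sample data are hypothetical, and this is not the PICAEval protocol itself.

```python
from collections import defaultdict

# Category names are taken from the abstract; everything else is illustrative.
CATEGORIES = ("optics", "mechanics", "state_transition")


def aggregate(records):
    """Return per-category accuracy from (category, passed) pairs."""
    totals = defaultdict(int)
    passes = defaultdict(int)
    for category, passed in records:
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category}")
        totals[category] += 1
        passes[category] += int(passed)
    # Skip categories with no samples to avoid division by zero.
    return {c: passes[c] / totals[c] for c in CATEGORIES if totals[c]}


# Hypothetical judgments, e.g. yes/no answers from a judge model or annotator.
sample = [
    ("optics", True),
    ("optics", False),
    ("mechanics", True),
    ("state_transition", False),
]
scores = aggregate(sample)
print(scores)
```

Reporting one score per category, rather than a single average, keeps weaknesses in one area (say, optics) from being hidden by strength in another.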

Code & Models

Datasets

Andrew613/PICA-100K
dataset · 2.6k downloads

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Visual Attention and Saliency Detection