PhysEditBench: A Protocol-Conditioned Benchmark for Dense Physical-Map Prediction with Image Editors
Jiaxin Yang, Yu Hou, Muxin Liu, Weixuan Liu, Ze Yuan, Zeming Chen, Zhongrui Wang, and Xiaojuan Qi

TL;DR
PhysEditBench introduces a new protocol-conditioned benchmark to evaluate image editors' ability to predict dense physical maps from single RGB images, emphasizing the role of prompts and interaction modes.
Contribution
The paper presents PhysEditBench, a novel benchmark with standardized protocols for evaluating image editors on dense physical-map prediction tasks.
Findings
Specialized models outperform image editors on depth, normal, and albedo.
Image editors can match or surpass specialized models on roughness and metallic metrics.
Structural errors and lighting sensitivity remain challenges for image editors.
Abstract
Can general-purpose image editors predict physical maps from a single RGB image? General-purpose image editors differ from standard task-specific dense-prediction models: they do not directly take an image and output a physical map. Instead, they must be guided by prompts, examples, or image-based textual cues. To this end, we introduce PhysEditBench, a novel protocol-conditioned benchmark to evaluate and standardize image editors in dense physical-map prediction that covers five targets: depth, normal, albedo, roughness, and metallic maps. For evaluation data, we build a target-dependent benchmark substrate. We use OpenRooms-FF for depth, surface normal, albedo, and roughness; InteriorVerse as an additional source for depth, normal, and albedo; and a new procedurally generated source for metallic maps. We curate the data with quality checks, valid-region masks, scene-level sampling, and…
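The target-dependent substrate and the valid-region masks described in the abstract can be illustrated with a small sketch. The target-to-source mapping follows the abstract; everything else is an assumption made for illustration: the names `SOURCES`, `sources_for`, and `masked_mae` are hypothetical, the label `"procedural-metallic"` is a placeholder for the unnamed procedurally generated source, and masked mean absolute error is a stand-in metric, since the excerpt does not state which metrics the benchmark uses. Plain Python lists are used to keep the example dependency-free.

```python
# Target-to-source mapping, as listed in the abstract.
# "procedural-metallic" is a placeholder label, not a name from the paper.
SOURCES = {
    "depth": ["OpenRooms-FF", "InteriorVerse"],
    "normal": ["OpenRooms-FF", "InteriorVerse"],
    "albedo": ["OpenRooms-FF", "InteriorVerse"],
    "roughness": ["OpenRooms-FF"],
    "metallic": ["procedural-metallic"],
}


def sources_for(target):
    """Return the evaluation sources for one physical-map target."""
    if target not in SOURCES:
        raise KeyError(f"unknown target: {target}")
    return SOURCES[target]


def masked_mae(pred, target, mask):
    """Mean absolute error over pixels where the valid-region mask is True.

    Assumed stand-in metric: pred, target, and mask are 2D lists of equal shape.
    """
    total, count = 0.0, 0
    for p_row, t_row, m_row in zip(pred, target, mask):
        for p, t, m in zip(p_row, t_row, m_row):
            if m:  # skip pixels outside the valid region
                total += abs(p - t)
                count += 1
    if count == 0:
        raise ValueError("mask selects no valid pixels")
    return total / count


if __name__ == "__main__":
    pred = [[0.5, 1.0], [0.0, 0.25]]
    gt = [[0.0, 1.0], [9.0, 0.75]]
    valid = [[True, True], [False, True]]  # bottom-left pixel is invalid
    print(sources_for("roughness"))
    print(masked_mae(pred, gt, valid))
```

In this sketch the invalid pixel, whose error would otherwise dominate, is excluded from the average, which is the purpose a valid-region mask serves in dense-prediction evaluation.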
