TL;DR
SenseBench is a new benchmark designed to evaluate and improve vision-language models' ability to perceive and describe low-level visual degradations in remote sensing images, addressing domain gaps and diagnostic needs.
Contribution
The paper introduces the first dedicated diagnostic benchmark for low-level visual perception and description in remote sensing, with a physics-based hierarchical taxonomy and comprehensive evaluation protocols.
Findings
The 29 state-of-the-art VLMs evaluated show domain bias and low-level perception issues.
SenseBench reveals phenomena such as fluency illusion and perception-description inversion.
The benchmark supports further VLM development for remote sensing applications.
Abstract
Low-level visual perception underpins reliable remote sensing (RS) image analysis, yet current image quality assessment (IQA) methods output uninterpretable scalar scores rather than characterizing physics-driven RS degradations, deviating markedly from the diagnostic needs of RS experts. While Vision-Language Models (VLMs) present a compelling alternative by delivering language-grounded IQA, their visual priors are heavily biased toward ground-level natural images. Consequently, whether VLMs can overcome this domain gap to perceive and articulate RS artifacts remains insufficiently studied. To bridge this gap, we propose SenseBench, the first dedicated diagnostic benchmark for RS low-level visual perception and description. Driven by a physics-based hierarchical taxonomy that unifies both non-reference and reference-based paradigms, SenseBench features over 10K meticulously…
