VLRS-Bench: A Vision-Language Reasoning Benchmark for Remote Sensing

Zhiming Luo; Di Wang; Haonan Guo; Jing Zhang; Bo Du

arXiv:2602.07045·cs.CV·May 15, 2026

TL;DR

VLRS-Bench is a benchmark for evaluating how well multimodal large language models (MLLMs) handle complex reasoning over remote sensing (RS) imagery, filling a gap left by existing RS benchmarks, which focus mainly on perception.

Contribution

The paper introduces what the authors describe as the first benchmark dedicated exclusively to complex RS reasoning: 2,000 question-answer pairs spanning 14 tasks and up to eight temporal phases, built with a pipeline that integrates RS-specific priors and expert knowledge.

Findings

01

Existing MLLMs show significant bottlenecks in complex RS reasoning.

02

VLRS-Bench covers 14 tasks and up to eight temporal phases.

03

The benchmark highlights critical gaps in current multimodal reasoning capabilities.

Abstract

Recent advancements in Multimodal Large Language Models (MLLMs) have enabled complex reasoning. However, existing remote sensing (RS) benchmarks remain heavily biased toward perception tasks, such as object recognition and scene classification. This limitation hinders the development of MLLMs for cognitively demanding RS applications. To address this, we propose a Vision Language ReaSoning Benchmark (VLRS-Bench), which is the first benchmark exclusively dedicated to complex RS reasoning. Structured across the three core dimensions of Cognition, Decision, and Prediction, VLRS-Bench comprises 2,000 question-answer pairs with an average question length of 130.19 words, spanning 14 tasks and up to eight temporal phases. VLRS-Bench is constructed via a specialized pipeline that integrates RS-specific priors and expert knowledge to ensure geospatial realism and reasoning complexity.…
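The abstract organizes the benchmark along three dimensions (Cognition, Decision, Prediction), so results would naturally be reported per dimension. The following is a minimal sketch of that aggregation, not the authors' evaluation code. The record fields `dimension`, `answer`, and `prediction`, the exact-match scoring rule, and the sample records are all assumptions made for illustration; the actual dataset schema and metric may differ.

```python
from collections import defaultdict

# The three dimensions named in the abstract.
DIMENSIONS = ("Cognition", "Decision", "Prediction")


def accuracy_by_dimension(records):
    """Return exact-match accuracy per dimension.

    Each record is a dict with hypothetical keys:
      "dimension"  - one of DIMENSIONS
      "answer"     - reference answer string
      "prediction" - model output string
    Dimensions with no records are omitted from the result.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for rec in records:
        dim = rec["dimension"]
        if dim not in DIMENSIONS:
            raise ValueError(f"unknown dimension: {dim!r}")
        totals[dim] += 1
        # Assumed scoring rule: case-insensitive exact match after trimming.
        if rec["prediction"].strip().upper() == rec["answer"].strip().upper():
            correct[dim] += 1
    return {d: correct[d] / totals[d] for d in DIMENSIONS if totals[d]}


# Made-up records, for illustration only (not benchmark data).
sample = [
    {"dimension": "Cognition", "answer": "B", "prediction": "b "},
    {"dimension": "Cognition", "answer": "A", "prediction": "C"},
    {"dimension": "Decision", "answer": "D", "prediction": "D"},
]
scores = accuracy_by_dimension(sample)
print(scores)
```

Reporting per dimension rather than a single pooled score keeps a weakness in one dimension from being hidden by strength in another.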

Code & Models

Repositories

MiliLab/VLRS-Bench
github

Datasets

thislzm/VLRS-Bench
dataset· 1.8k dl
