Think and Answer ME: Benchmarking and Exploring Multi-Entity Reasoning Grounding in Remote Sensing
Shuchang Lyu, Haiquan Wen, Guangliang Cheng, Meng Li, Zheng Zhou, You Zhou, Dingding Yao, Zhenwei Shi

TL;DR
This paper introduces a new benchmark dataset and a reasoning framework for multi-entity reasoning in remote sensing, advancing the explicit reasoning capabilities beyond perception-level matching.
Contribution
It presents the ME-RSRG dataset and the Entity-Aware Reasoning (EAR) framework, enabling structured multi-entity reasoning in remote sensing grounding tasks.
Findings
The ME-RSRG dataset reveals the complexity of multi-entity reasoning in remote sensing.
The EAR framework effectively generates structured reasoning traces and grounding outputs.
Experimental results demonstrate the superiority of EAR over baseline methods.
Abstract
Recent advances in reasoning language models and reinforcement learning with verifiable rewards have significantly enhanced multi-step reasoning capabilities. This progress motivates the extension of reasoning paradigms to remote sensing visual grounding task. However, existing remote sensing grounding methods remain largely confined to perception-level matching and single-entity formulations, limiting the role of explicit reasoning and inter-entity modeling. To address this challenge, we introduce a new benchmark dataset for Multi-Entity Reasoning Grounding in Remote Sensing (ME-RSRG). Based on ME-RSRG, we reformulate remote sensing grounding as a multi-entity reasoning task and propose an Entity-Aware Reasoning (EAR) framework built upon visual-linguistic foundation models. EAR generates structured reasoning traces and subject-object grounding outputs. It adopts supervised fine-tuning…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsMultimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Advanced Graph Neural Networks
