RemoteReasoner: Towards Unifying Geospatial Reasoning Workflow
Liang Yao, Fan Liu, Hongbo Lu, Chuanyi Zhang, Rui Min, Shengxiang Xu, Shimin Di, Pai Peng

TL;DR
RemoteReasoner is a unified geospatial reasoning framework that leverages multi-modal large language models and reinforcement learning to interpret complex spatial queries and perform diverse reasoning tasks without task-specific fine-tuning.
Contribution
The paper introduces a unified geospatial reasoning workflow that combines multi-modal large language models (MLLMs) with reinforcement learning to perform autonomous, multi-granularity spatial reasoning tasks.
Findings
Achieves state-of-the-art performance on multi-granularity reasoning tasks.
Demonstrates robust generalization to unseen tasks and out-of-distribution categories.
Operates without task-specific decoders or additional fine-tuning.
Abstract
Remote sensing imagery presents vast, inherently unstructured spatial data, necessitating sophisticated reasoning to interpret complex user intents and contextual relationships beyond simple recognition tasks. In this paper, we aim to construct an Earth observation workflow to handle complex queries by reasoning about spatial context and user intent. As a reasoning workflow, it should autonomously explore and construct its own inference paths, rather than being confined to predefined ground-truth sequences. Ideally, its architecture ought to be unified yet generalized, possessing capabilities to perform diverse reasoning tasks through one model without requiring additional fine-tuning. Existing remote sensing approaches rely on supervised fine-tuning paradigms and task-specific heads, limiting both autonomous reasoning and unified generalization. To this end, we propose RemoteReasoner,…
Taxonomy
Topics: Multimodal Machine Learning Applications · Constraint Satisfaction and Optimization · Remote-Sensing Image Classification
