GeoSeg: Training-Free Reasoning-Driven Segmentation in Remote Sensing Imagery
Lifan Jiang, Yuhang Pei, Boxi Wu, Yan Zhao, Tianrun Wu, Shulong Yu, Lihui Zhang, Deng Cai

TL;DR
GeoSeg is a training-free, reasoning-driven segmentation framework for remote sensing imagery. It combines multimodal large language model (MLLM) reasoning with spatial cues to localize targets zero-shot, without task-specific supervision.
Contribution
The paper introduces GeoSeg, a zero-shot, training-free approach that combines MLLM reasoning with spatial cues for remote sensing segmentation, and GeoSeg-Bench, a new diagnostic benchmark.
Findings
GeoSeg outperforms baseline methods in remote sensing segmentation tasks.
The dual-route prompting mechanism effectively fuses semantic and spatial information.
Component ablations confirm the importance of each part of GeoSeg.
Abstract
Recent advances in MLLMs are reframing segmentation from fixed-category prediction to instruction-grounded localization. While reasoning-based segmentation has progressed rapidly in natural scenes, remote sensing lacks a generalizable solution due to the prohibitive cost of reasoning-oriented data and domain-specific challenges like overhead viewpoints. We present GeoSeg, a zero-shot, training-free framework that bypasses the supervision bottleneck for reasoning-driven remote sensing segmentation. GeoSeg couples MLLM reasoning with precise localization via: (i) bias-aware coordinate refinement to correct systematic grounding shifts and (ii) a dual-route prompting mechanism to fuse semantic intent with fine-grained spatial cues. We also introduce GeoSeg-Bench, a diagnostic benchmark of 810 image-query pairs with hierarchical difficulty levels. Experiments show that GeoSeg consistently…
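The abstract does not specify how the bias-aware coordinate refinement is computed. As a minimal sketch of the general idea only, the snippet below assumes the systematic grounding shift can be modeled as a constant pixel offset that is subtracted from an MLLM-predicted box, with the result clipped to the image bounds. The function name `refine_box`, the `bias` parameter, and the constant-offset model are all hypothetical illustrations, not the authors' method.

```python
from typing import Tuple

# (x_min, y_min, x_max, y_max) in pixel coordinates
Box = Tuple[float, float, float, float]


def refine_box(box: Box, bias: Tuple[float, float], width: int, height: int) -> Box:
    """Shift a predicted box by an assumed constant bias and clip it to the image.

    Assumption: the grounding error is a fixed (dx, dy) translation. The paper's
    actual refinement may be more elaborate; this is illustrative only.
    """
    dx, dy = bias
    x0, y0, x1, y1 = box

    # Undo the assumed systematic shift.
    x0, x1 = x0 - dx, x1 - dx
    y0, y1 = y0 - dy, y1 - dy

    def clamp(value: float, upper: int) -> float:
        # Keep a coordinate inside [0, upper].
        return max(0.0, min(float(upper), value))

    return (clamp(x0, width), clamp(y0, height), clamp(x1, width), clamp(y1, height))
```

For example, `refine_box((100.0, 50.0, 200.0, 150.0), (10.0, -5.0), 512, 512)` returns `(90.0, 55.0, 190.0, 155.0)`. A corrected box of this kind could then serve as a spatial prompt for a downstream segmenter, which is one plausible reading of how spatial cues enter the pipeline.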
Taxonomy
Topics: Advanced Neural Network Applications · Multimodal Machine Learning Applications · Remote-Sensing Image Classification
