Geospatial-Reasoning-Driven Vocabulary-Agnostic Remote Sensing Semantic Segmentation
Chufeng Zhou, Jian Wang, Xinyuan Liu, and Xiaokang Zhang

TL;DR
This paper introduces a geospatial reasoning framework for open-vocabulary remote sensing segmentation. It targets semantic ambiguity between classes with similar spectral or structural patterns by combining offline knowledge distillation with online instance reasoning, with the aim of more accurate segmentation in geographically complex scenes.
Contribution
It proposes GR-CoT (Geospatial Reasoning Chain-of-Thought), a framework that combines an offline knowledge distillation stream with an online instance reasoning stream to generate an image-adaptive vocabulary for segmentation.
Findings
Improves segmentation performance on LoveDA and GID5 benchmarks.
Produces more semantically coherent predictions in complex scenes.
Abstract
Open-vocabulary semantic segmentation has become an important direction in remote sensing, as it enables recognition beyond predefined land-cover categories. However, existing methods mainly depend on passive visual-text matching and often struggle with semantic ambiguity in geographically complex scenes, especially when different classes exhibit similar spectral or structural patterns. To address this issue, we propose a Geospatial Reasoning Chain-of-Thought (GR-CoT) framework for remote sensing open-vocabulary semantic segmentation. GR-CoT consists of an offline knowledge distillation stream and an online instance reasoning stream. The former constructs category interpretation standards for confusing classes, while the latter performs macro-scenario anchoring, visual feature decoupling, and knowledge-driven decision synthesis to generate an image-adaptive vocabulary for downstream…
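The two-stream structure described in the abstract can be sketched roughly as follows. This is a minimal illustration only, not the authors' implementation: the abstract does not specify which models or prompts are used, so simple rule-based stand-ins take their place. All names here (`build_interpretation_standards`, `anchor_scene`, `decouple_features`, `synthesize_vocabulary`, `SCENE_PRIORS`, `CONFUSING_PAIRS`) and all class lists are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical toy data; none of these pairs or priors come from the paper.
CONFUSING_PAIRS = [("agriculture", "barren"), ("water", "shadow")]
SCENE_PRIORS = {
    "rural": {"agriculture", "water", "forest", "barren"},
    "urban": {"building", "road", "water", "shadow"},
}


def build_interpretation_standards(pairs):
    """Offline stream stand-in: one textual rule per confusing class pair."""
    return {
        frozenset(pair): f"Distinguish {pair[0]} from {pair[1]} using scene context."
        for pair in pairs
    }


@dataclass
class ImageCues:
    scene: str          # coarse scene label, assumed given for this sketch
    visual_cues: list   # candidate classes suggested by appearance alone


def anchor_scene(cues):
    """Macro-scenario anchoring stand-in: classes plausible for the scene."""
    return SCENE_PRIORS.get(cues.scene, set())


def decouple_features(cues):
    """Visual feature decoupling stand-in: de-duplicate cues, keep order."""
    return list(dict.fromkeys(cues.visual_cues))


def synthesize_vocabulary(cues, standards):
    """Decision synthesis stand-in: keep scene-plausible candidates and
    attach the interpretation rule for any confusing pair still present."""
    plausible = anchor_scene(cues)
    vocab = [c for c in decouple_features(cues) if c in plausible]
    notes = [rule for pair, rule in standards.items() if pair <= set(vocab)]
    return vocab, notes


standards = build_interpretation_standards(CONFUSING_PAIRS)
cues = ImageCues("rural", ["agriculture", "barren", "building", "agriculture"])
vocab, notes = synthesize_vocabulary(cues, standards)
print(vocab)
print(notes)
```

In this toy run, the out-of-scene candidate is dropped and the rule for the remaining confusing pair is attached to the vocabulary. In the framework the abstract describes, the resulting image-adaptive vocabulary would then be passed to a downstream open-vocabulary segmenter, which is omitted here.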
Taxonomy
Topics: Remote-Sensing Image Classification · Multimodal Machine Learning Applications · Geographic Information Systems Studies
