DGL-RSIS: Decoupling Global Spatial Context and Local Class Semantics for Training-Free Remote Sensing Image Segmentation

Boyi Li; Ce Zhang; Richard M. Timmerman; Wenxuan Bao

arXiv:2509.00598 · cs.CV · November 12, 2025

Open Access

TL;DR

DGL-RSIS is a training-free framework that applies vision-language models to remote sensing image segmentation by decoupling global spatial context from local class semantics.

Contribution

It introduces the first unified training-free approach that transfers vision-language models to remote sensing segmentation by decoupling spatial context and class semantics.

Findings

01 Outperforms existing training-free methods on benchmarks

02 Effectively handles open-vocabulary and referring expression segmentation

03 Validates each module's contribution through ablation studies

Abstract

The emergence of vision-language models (VLMs) bridges the gap between vision and language, enabling multimodal understanding beyond traditional visual-only deep learning models. However, transferring VLMs from the natural image domain to remote sensing (RS) segmentation remains challenging due to the large domain gap and the diversity of RS inputs across tasks, particularly in open-vocabulary semantic segmentation (OVSS) and referring expression segmentation (RES). Here, we propose a training-free unified framework, termed DGL-RSIS, which decouples visual and textual representations and performs vision-language alignment at both local semantic and global contextual levels. Specifically, a Global-Local Decoupling (GLD) module decomposes textual inputs into local semantic tokens and global contextual tokens, while image inputs are partitioned into class-agnostic mask proposals. Then, a…
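The decoupling idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the keyword-based split, the `SPATIAL_WORDS` set, the function names, and the two-dimensional embeddings are all hypothetical stand-ins, since the excerpt does not say how the GLD module or the alignment step is actually built. The sketch only shows the general shape: a text query is split into local (class) tokens and global (context) tokens, and each class-agnostic mask proposal is labeled by cosine similarity to class embeddings.

```python
import math

# Hypothetical list of spatial cue words; the excerpt does not describe
# how the paper identifies global contextual tokens.
SPATIAL_WORDS = {"left", "right", "top", "bottom", "upper", "lower", "center", "near"}


def decouple_text(expression, class_names):
    """Split a query into local (class) tokens and global (context) tokens."""
    tokens = expression.lower().replace(",", " ").split()
    local_tokens = [t for t in tokens if t in class_names]
    global_tokens = [t for t in tokens if t in SPATIAL_WORDS]
    return local_tokens, global_tokens


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def label_proposals(proposal_embeddings, class_embeddings):
    """Give each class-agnostic mask proposal the most similar class name."""
    labels = []
    for emb in proposal_embeddings:
        best = max(class_embeddings, key=lambda name: cosine(emb, class_embeddings[name]))
        labels.append(best)
    return labels


# Toy inputs (assumed values, for illustration only).
local_tokens, global_tokens = decouple_text("the building on the left", {"building", "road"})
toy_classes = {"building": [1.0, 0.0], "road": [0.0, 1.0]}
toy_proposals = [[0.9, 0.1], [0.2, 0.8]]
labels = label_proposals(toy_proposals, toy_classes)
```

In a real pipeline the embeddings would presumably come from a VLM's text and image encoders, and the global tokens would be used in a separate context-level alignment step; both are left out here because the excerpt is truncated before describing them.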

Taxonomy

Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications