Structure-Semantic Decoupled Modulation of Global Geospatial Embeddings for High-Resolution Remote Sensing Mapping
Jienan Lyu, Miao Yang, Jinchen Cai, Yiwen Hu, Guanyi Lu, Junhao Qiu, Runmin Dong

TL;DR
This paper introduces SSDM, a framework that decouples global geospatial embeddings into a structural pathway and a semantic pathway, which the authors report improves the accuracy of high-resolution remote sensing mapping.
Contribution
The proposed SSDM framework integrates global geospatial representations with high-resolution imagery by decoupling them into two complementary injection pathways, structural and semantic, that guide feature extraction in the high-resolution model.
Findings
Achieves state-of-the-art results in high-resolution land cover mapping.
Effectively suppresses prediction fragmentation and enhances semantic consistency.
Demonstrates universal applicability across diverse remote sensing scenarios.
Abstract
Fine-grained high-resolution remote sensing mapping typically relies on localized visual features, which restricts cross-domain generalizability and often leads to fragmented predictions of large-scale land covers. While global geospatial foundation models offer powerful, generalizable representations, directly fusing their high-dimensional implicit embeddings with high-resolution visual features frequently triggers feature interference and spatial structure degradation due to a severe semantic-spatial gap. To overcome these limitations, we propose a Structure-Semantic Decoupled Modulation (SSDM) framework, which decouples global geospatial representations into two complementary cross-modal injection pathways. First, the structural prior modulation branch introduces the macroscopic receptive field priors from global representations into the self-attention modules of the high-resolution…
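The abstract is cut off before it specifies how the structural prior enters self-attention, so the following is only an illustrative sketch, not the authors' design. It assumes one common way to inject an external prior: an additive attention bias built from the cosine similarity of global embeddings that have already been resampled to the token grid. The function name `prior_biased_attention`, the weight `alpha`, and all shapes are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def prior_biased_attention(tokens, global_emb, w_q, w_k, w_v, alpha=1.0):
    """Single-head self-attention with an additive bias from a global prior.

    tokens:     (N, d) high-resolution visual tokens.
    global_emb: (N, g) global geospatial embeddings, assumed to be
                resampled so that row i matches token i (hypothetical).
    alpha:      hypothetical scalar controlling the strength of the prior.
    """
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])

    # Cosine similarity between global embeddings: tokens whose global
    # context is alike receive a larger bias and attend to each other more.
    g = global_emb / (np.linalg.norm(global_emb, axis=-1, keepdims=True) + 1e-8)
    bias = g @ g.T

    attn = softmax(scores + alpha * bias, axis=-1)
    return attn @ v, attn

rng = np.random.default_rng(0)
N, d, g_dim = 16, 8, 4
tokens = rng.normal(size=(N, d))
global_emb = rng.normal(size=(N, g_dim))
w_q, w_k, w_v = (rng.normal(size=(d, d)) for _ in range(3))

out, attn = prior_biased_attention(tokens, global_emb, w_q, w_k, w_v, alpha=2.0)
```

With `alpha=0` the function reduces to ordinary scaled dot-product attention, which makes the effect of the prior easy to isolate in an ablation.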
