SGMA: Semantic-Guided Modality-Aware Segmentation for Remote Sensing with Incomplete Multimodal Data
Lekang Wen, Liang Liao, Jing Xiao, Mi Wang

TL;DR
This paper introduces SGMA, a framework for remote sensing semantic segmentation with incomplete multimodal data. It uses semantic guidance to balance modalities and reduce intra-class variation.
Contribution
The paper proposes two plug-and-play modules, Semantic-Guided Fusion and Modality-Aware Sampling, which improve multimodal learning from incomplete data by addressing multimodal imbalance and cross-modal heterogeneity.
Findings
SGMA outperforms existing methods across multiple datasets.
It yields significant improvements on fragile modalities.
It mitigates intra-class variation and cross-modal conflicts.
Abstract
Multimodal semantic segmentation integrates complementary information from diverse sensors for remote sensing Earth observation. However, practical systems often encounter missing modalities due to sensor failures or incomplete coverage, a setting termed Incomplete Multimodal Semantic Segmentation (IMSS). IMSS faces three key challenges: (1) multimodal imbalance, where dominant modalities suppress fragile ones; (2) intra-class variation in scale, shape, and orientation across modalities; and (3) cross-modal heterogeneity, where conflicting cues produce inconsistent semantic responses. Existing methods rely on contrastive learning or joint optimization, which risk either over-alignment that discards modality-specific cues or imbalanced training that favors robust modalities, while largely overlooking intra-class variation and cross-modal heterogeneity. To address these limitations, we propose the…
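To make the IMSS setting concrete, the following is a minimal sketch of how missing modalities might be simulated and how the remaining ones could be fused by a plain average. It is not the SGMA method: the modality names (`optical`, `sar`, `dsm`), the `keep_prob` parameter, and both helper functions are hypothetical and chosen only for illustration.

```python
import random

def sample_modality_mask(modalities, rng, keep_prob=0.5):
    """Randomly mark each modality as present or missing (hypothetical helper).

    At least one modality is always kept so that fusion stays defined.
    """
    mask = {m: rng.random() < keep_prob for m in modalities}
    if not any(mask.values()):
        mask[rng.choice(list(modalities))] = True
    return mask

def fuse_available(features, mask):
    """Average the feature vectors of the modalities that are present.

    This naive baseline weights every available modality equally; it does
    nothing about imbalance or conflicting cues between modalities.
    """
    present = [features[m] for m in features if mask[m]]
    n = len(present)
    return [sum(vals) / n for vals in zip(*present)]

# Toy per-pixel feature vectors; names and values are illustrative assumptions.
features = {"optical": [1.0, 2.0], "sar": [3.0, 4.0], "dsm": [5.0, 6.0]}
rng = random.Random(0)
mask = sample_modality_mask(features, rng)
fused = fuse_available(features, mask)
```

Equal-weight averaging of this kind is the sort of baseline in which a dominant modality can overwhelm a fragile one, which is the imbalance problem the abstract describes.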
Taxonomy
Topics: Advanced Neural Network Applications · Remote-Sensing Image Classification · Domain Adaptation and Few-Shot Learning
