TL;DR
The paper introduces ARG-Mamba, a state space model framework for multi-source remote sensing image segmentation that models multi-scale context and fuses optical and elevation features across modalities.
Contribution
It proposes a Multi-Scale State Space Module and an Axial-Relation Guided Fusion Module for improved optical-elevation image segmentation.
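To make the state space idea concrete, here is a minimal sketch of a linear-time scan run at several resolutions. It is not the paper's module: the fixed scalar parameters `a`, `b`, `c`, the 1D input, the pooling strides, and the function names are all assumptions chosen for illustration. Mamba-style models use learned, input-dependent parameters and scan over 2D feature maps.

```python
from typing import List, Sequence


def ssm_scan(x: Sequence[float], a: float, b: float, c: float) -> List[float]:
    """Run h_t = a*h_{t-1} + b*x_t, y_t = c*h_t in one pass over the input."""
    h = 0.0
    y = []
    for x_t in x:
        h = a * h + b * x_t  # constant work per step, so O(L) overall
        y.append(c * h)
    return y


def avg_pool(x: Sequence[float], stride: int) -> List[float]:
    """Downsample by averaging non-overlapping windows of length `stride`."""
    chunks = [x[i:i + stride] for i in range(0, len(x), stride)]
    return [sum(chunk) / len(chunk) for chunk in chunks]


def upsample(x: Sequence[float], stride: int, length: int) -> List[float]:
    """Nearest-neighbour upsampling back to the original length."""
    return [x[i // stride] for i in range(length)]


def multi_scale_scan(
    x: Sequence[float],
    strides: Sequence[int] = (1, 2, 4),  # hypothetical scales
    a: float = 0.9,
    b: float = 0.1,
    c: float = 1.0,
) -> List[float]:
    """Average the scan outputs computed at several resolutions."""
    out = [0.0] * len(x)
    for s in strides:
        coarse = avg_pool(x, s)
        y = upsample(ssm_scan(coarse, a, b, c), s, len(x))
        out = [o + v / len(strides) for o, v in zip(out, y)]
    return out


if __name__ == "__main__":
    print(ssm_scan([1.0, 0.0, 0.0], a=0.5, b=1.0, c=1.0))
    print(multi_scale_scan([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]))
```

The stride-1 scan follows fine local variation, while the coarser scans carry information over longer distances for the same number of recurrence steps. Every scan touches each element once, so the total cost stays linear in the input length.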
Findings
Outperforms existing methods on the ISPRS Vaihingen and Potsdam datasets.
Achieves higher segmentation accuracy while keeping computation efficient; the Multi-Scale State Space Module runs with linear computational complexity.
Demonstrates robustness in complex high-resolution scenes.
Abstract
Semantic segmentation of multi-source remote sensing images is a fundamental task for Earth observation applications. Existing methods often struggle with insufficient multi-scale context modeling and suboptimal cross-modal feature fusion, limiting their performance in complex high-resolution scenes. To this end, we propose Axial-Relation Guided Fusion Mamba (ARG-Mamba), a state space model-based framework for optical-elevation remote sensing image segmentation. Specifically, we introduce a Multi-Scale State Space Module to capture both fine-grained local details and global contextual dependencies with linear computational complexity. Moreover, an Axial-Relation Guided Fusion Module is designed to explicitly model global cross-modal correlations along horizontal and vertical axes, enabling efficient feature fusion between optical and elevation modalities. Extensive experiments conducted…
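The abstract does not spell out how the Axial-Relation Guided Fusion Module works. The sketch below substitutes plain axial cross-attention to illustrate what modeling cross-modal correlations along horizontal and vertical axes can look like. Single-channel feature maps, dot-product scores, the residual sum, and all function names are assumptions, not the authors' design.

```python
import math
from typing import List

Grid = List[List[float]]  # a single-channel H x W feature map


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def row_cross_attend(query: Grid, context: Grid) -> Grid:
    """Each query pixel attends to the context pixels in its own row."""
    out = []
    for q_row, c_row in zip(query, context):
        new_row = []
        for q in q_row:
            weights = softmax([q * c for c in c_row])
            new_row.append(sum(w * c for w, c in zip(weights, c_row)))
        out.append(new_row)
    return out


def transpose(grid: Grid) -> Grid:
    return [list(col) for col in zip(*grid)]


def axial_fuse(optical: Grid, elevation: Grid) -> Grid:
    """Add row-wise and column-wise elevation context to the optical map."""
    horizontal = row_cross_attend(optical, elevation)
    # Column attention is row attention applied to the transposed maps.
    vertical = transpose(row_cross_attend(transpose(optical), transpose(elevation)))
    return [
        [o + h + v for o, h, v in zip(o_row, h_row, v_row)]
        for o_row, h_row, v_row in zip(optical, horizontal, vertical)
    ]


if __name__ == "__main__":
    optical = [[0.0, 0.0], [0.0, 0.0]]
    elevation = [[1.0, 3.0], [5.0, 7.0]]
    print(axial_fuse(optical, elevation))
```

With all-zero optical queries the attention weights are uniform, so each output pixel is the mean of its elevation row plus the mean of its elevation column. Each pixel compares against H + W positions rather than all H × W, which is the usual efficiency argument for axial designs.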
