RSGMamba: Reliability-Aware Self-Gated State Space Model for Multimodal Semantic Segmentation
Guoan Xu, Yang Xiao, Guangwei Gao, Dongchen Zhu, Guo-Jun Qi, Wenjing Jia

TL;DR
This paper introduces RSGMamba, a novel reliability-aware multimodal fusion framework for semantic segmentation that dynamically models modality reliability to improve accuracy and robustness.
Contribution
It proposes the Reliability-aware Self-Gated Mamba Block (RSGMB) and the LCGM module for explicit reliability modeling and adaptive feature fusion, and reports that the resulting model outperforms existing methods.
Findings
Achieves state-of-the-art results on RGB-D and RGB-T benchmarks.
Improves mIoU (mean intersection over union) by up to 1.6% over prior methods.
Operates with only 48.6M parameters, demonstrating efficiency.
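For readers unfamiliar with the metric behind these numbers, here is a minimal sketch of how mean IoU is commonly computed from predicted and ground-truth label maps. It is a generic illustration, not the paper's evaluation code: the function name `mean_iou` and the choice to skip classes absent from both maps are assumptions, and benchmarks differ in how they handle ignored labels.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over classes present in pred or target.

    Hypothetical helper for illustration; assumes integer labels in
    [0, num_classes) and no 'ignore' label.
    """
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    conf = np.bincount(target * num_classes + pred, minlength=num_classes ** 2)
    conf = conf.reshape(num_classes, num_classes)
    intersection = np.diag(conf)
    # Union = predicted pixels + ground-truth pixels - overlap, per class.
    union = conf.sum(axis=0) + conf.sum(axis=1) - intersection
    valid = union > 0  # skip classes that appear in neither map
    return float((intersection[valid] / union[valid]).mean())

score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```

In this toy case class 0 has IoU 1/2 and class 1 has IoU 2/3, so `score` is their average, 7/12.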
Abstract
Multimodal semantic segmentation has emerged as a powerful paradigm for enhancing scene understanding by leveraging complementary information from multiple sensing modalities (e.g., RGB, depth, and thermal). However, existing cross-modal fusion methods often implicitly assume that all modalities are equally reliable, which can lead to feature degradation when auxiliary modalities are noisy, misaligned, or incomplete. In this paper, we revisit cross-modal fusion from the perspective of modality reliability and propose a novel framework termed the Reliability-aware Self-Gated State Space Model (RSGMamba). At the core of our method is the Reliability-aware Self-Gated Mamba Block (RSGMB), which explicitly models modality reliability and dynamically regulates cross-modal interactions through a self-gating mechanism. Unlike conventional fusion strategies that indiscriminately exchange…
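The general idea of gating an auxiliary modality by an estimated reliability can be sketched as follows. This is not the RSGMB itself, whose design the truncated abstract does not spell out: it is a plain sigmoid-gated residual fusion in NumPy, and the names `gated_fusion`, `w_gate`, and `b_gate` are hypothetical. The gate here is computed from the auxiliary features alone, which is one simple reading of "self-gating".

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(primary, auxiliary, w_gate, b_gate=0.0):
    """Fuse two (C, H, W) feature maps with a per-pixel reliability gate.

    Illustrative only. w_gate has shape (C,) and projects the auxiliary
    features to one reliability logit per pixel (an assumed design).
    """
    # Contract the channel axis: (C,) x (C, H, W) -> (H, W) logits.
    logit = np.tensordot(w_gate, auxiliary, axes=([0], [0])) + b_gate
    gate = sigmoid(logit)  # values in (0, 1); near 0 suppresses the auxiliary input
    # Residual fusion: the primary features always pass through unchanged.
    fused = primary + gate[None, :, :] * auxiliary
    return fused, gate

rgb_feat = np.ones((2, 2, 2))
depth_feat = np.ones((2, 2, 2))
# A strongly negative bias mimics an auxiliary modality judged unreliable.
fused, gate = gated_fusion(rgb_feat, depth_feat, w_gate=np.zeros(2), b_gate=-50.0)
```

With the gate driven toward zero, `fused` stays essentially equal to `rgb_feat`, which is the behavior the abstract motivates: a noisy or misaligned auxiliary modality should not degrade the primary features.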
