MMFNet: A Mamba-Based Multimodal Fusion Network for Remote Sensing Image Semantic Segmentation
Jingting Qiu, Wei Chang, Wei Ren, Shanshan Hou, Ronghao Yang

TL;DR
MMFNet is a new network for remote sensing image segmentation that combines optical and elevation data to improve accuracy and efficiency.
Contribution
MMFNet introduces a dual-encoder, Mamba-based architecture with a novel Multimodal Feature Fusion Block (MFFB) and a frequency-aware upsampling module (FreqFusion) for remote sensing image semantic segmentation.
Findings
MMFNet achieved 83.50% mean IoU on the ISPRS Vaihingen benchmark.
The model outperformed eight state-of-the-art methods while maintaining low computational complexity.
The MFFB and FreqFusion modules improved boundary delineation and feature integration.
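For readers unfamiliar with the headline metric, the following is a minimal sketch of how mean IoU is commonly computed from a confusion matrix. It is an illustration only, not the authors' evaluation code: the function names, the toy labels, and the choice to skip classes that never occur are assumptions, and the exact protocol used for the Vaihingen result (for example, which classes are included) is not stated in this summary.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are reference classes, columns are predicted classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def mean_iou(cm):
    """Average per-class IoU = TP / (TP + FP + FN).

    Assumption: classes absent from both reference and prediction are skipped.
    """
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                         # reference pixels of class c that were missed
        fp = sum(cm[r][c] for r in range(n)) - tp    # pixels wrongly predicted as class c
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious)


# Hypothetical flattened label maps for a 3-class toy example (not real data)
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, 3)
miou = mean_iou(cm)
```

In this toy case the per-class IoUs are 1/3, 2/3, and 1/2. Because every class is weighted equally regardless of pixel count, mean IoU penalizes poor performance on small classes more than overall pixel accuracy does.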
Abstract
Accurate semantic segmentation of high-resolution remote sensing imagery is challenged by substantial intra-class variability, inter-class similarity, and the limitations of single-modality data. This paper proposes MMFNet, a novel multimodal fusion network that leverages the Mamba architecture to efficiently capture long-range dependencies for semantic segmentation tasks. MMFNet adopts a dual-encoder design, combining ResNet-18 for local detail extraction and VMamba for global contextual modelling, striking a balance between segmentation accuracy and computational efficiency. A Multimodal Feature Fusion Block (MFFB) is introduced to effectively integrate complementary information from optical imagery and digital surface models (DSMs), thereby enhancing multimodal feature interaction and improving segmentation accuracy. Furthermore, a frequency-aware upsampling module (FreqFusion) is…
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Remote-Sensing Image Classification
