TL;DR
UNetMamba is a lightweight UNet-like model for semantic segmentation of high-resolution remote sensing images. It combines a Mamba-based segmentation decoder with a train-only local supervision module, and reports mIoU gains over state-of-the-art methods of 0.87% on LoveDA and 0.39% on ISPRS Vaihingen.
Contribution
The paper introduces UNetMamba, a UNet-like model built on Mamba. It adds a mamba segmentation decoder (MSD) to decode high-resolution image information efficiently, and a local supervision module (LSM) that is used only during training to improve the perception of local content.
Findings
Outperforms state-of-the-art methods in mIoU, with gains of 0.87% on LoveDA and 0.39% on ISPRS Vaihingen.
Achieves high efficiency with lightweight design, less memory, and lower computational cost.
Demonstrates significant accuracy improvements over existing Transformer-based methods.
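The findings above are stated in mIoU (mean Intersection over Union): for each class, the number of pixels where prediction and ground truth agree is divided by the number of pixels assigned to that class by either one, and the per-class ratios are averaged. The following is a minimal pure-Python sketch of that definition, not the paper's evaluation code. The function name `mean_iou` and the choice to skip classes absent from both prediction and ground truth are assumptions made for illustration; benchmark protocols may treat such classes, or ignored labels, differently.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes for two flat sequences of integer labels.

    Assumption (illustrative): classes that appear in neither `pred`
    nor `target` are skipped rather than counted as IoU = 0.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes   # pixels where pred == target == c
    pred_count = [0] * num_classes     # pixels predicted as class c
    target_count = [0] * num_classes   # pixels labeled as class c

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else float("nan")


# Toy example: six "pixels", three classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(round(score, 4))  # per-class IoUs are 1/3, 2/3 and 1/2, so the mean is 0.5
```

In practice the label maps would be flattened 2D arrays and the counts accumulated over the whole test set before dividing, rather than averaged per image.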
Abstract
Semantic segmentation of high-resolution remote sensing images is vital in downstream applications such as land-cover mapping, urban planning and disaster assessment. Existing Transformer-based methods suffer from the constraint between accuracy and efficiency, while the recently proposed Mamba is renowned for being efficient. Therefore, to overcome the dilemma, we propose UNetMamba, a UNet-like semantic segmentation model based on Mamba. It incorporates a mamba segmentation decoder (MSD) that can efficiently decode the complex information within high-resolution images, and a local supervision module (LSM), which is train-only but can significantly enhance the perception of local contents. Extensive experiments demonstrate that UNetMamba outperforms the state-of-the-art methods with mIoU increased by 0.87% on LoveDA and 0.39% on ISPRS Vaihingen, while achieving high efficiency through…
Taxonomy
Methods: Mamba: Linear-Time Sequence Modeling with Selective State Spaces
