UV-Mamba: A DCN-Enhanced State Space Model for Urban Village Boundary Identification in High-Resolution Remote Sensing Images
Lulin Li, Ben Chen, Xuechao Zou, Junliang Xing, Pin Tao

TL;DR
UV-Mamba is a neural network model that identifies urban village boundaries in high-resolution remote sensing images. It uses deformable convolutions to mitigate the memory loss that state space models suffer on long sequences, and it achieves state-of-the-art results.
Contribution
The paper introduces UV-Mamba, a neural network architecture that improves boundary detection in remote sensing images by augmenting state space models with deformable convolutions.
Findings
Achieves 73.3% and 78.1% IoU on the Beijing and Xi'an datasets, respectively.
Outperforms previous models by 1.2% and 3.4% IoU.
Runs inference 6x faster with a 40x smaller model.
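The IoU (intersection over union) figures above can be made concrete with a short sketch. It assumes the metric is computed on binary masks where 1 marks an urban village pixel. The function name `binary_iou` and the choice to return 1.0 when both masks are empty are illustrative assumptions, not taken from the paper.

```python
def binary_iou(pred, target):
    """Intersection over union of two equally sized binary masks.

    Masks are lists of rows holding 0/1 values. This is an illustrative
    helper, not the evaluation code used by the paper.
    """
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Assumption: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
score = binary_iou(pred, target)  # 2 shared pixels out of 4 in the union
```

Here the two masks share 2 pixels and cover 4 pixels in total, so the score is 0.5.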
Abstract
Due to the diverse geographical environments, intricate landscapes, and high-density settlements, the automatic identification of urban village boundaries using remote sensing images remains a highly challenging task. This paper proposes a novel and efficient neural network model called UV-Mamba for accurate boundary detection in high-resolution remote sensing images. UV-Mamba mitigates the memory loss problem in lengthy sequence modeling, which arises in state space models with increasing image size, by incorporating deformable convolutions. Its architecture utilizes an encoder-decoder framework and includes an encoder with four deformable state space augmentation blocks for efficient multi-level semantic extraction and a decoder to integrate the extracted semantic information. We conducted experiments on two large datasets showing that UV-Mamba achieves state-of-the-art performance.…
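The memory loss that the abstract attributes to state space models on long sequences can be illustrated with a toy linear recurrence. The sketch below is a simplification under stated assumptions: it uses a scalar state with fixed parameters `a`, `b`, and `c` (all hypothetical values), whereas Mamba-style models use learned, input-dependent parameters, and it does not include the deformable convolutions that UV-Mamba adds.

```python
def ssm_scan(xs, a=0.9, b=1.0, c=1.0):
    """Run a scalar linear state space recurrence over a sequence.

    h_t = a * h_{t-1} + b * x_t
    y_t = c * h_t

    The parameters are fixed, illustrative values, not the paper's.
    """
    h = 0.0
    ys = []
    for x in xs:
        h = a * h + b * x
        ys.append(c * h)
    return ys


# Feed a single impulse at step 0, then zeros. The output at step t
# equals a**t, so the trace of the first input shrinks geometrically.
impulse = [1.0] + [0.0] * 99
ys = ssm_scan(impulse)
early, late = ys[1], ys[99]
```

With `a = 0.9`, the first input still contributes 0.9 after one step but less than 0.0001 after 99 steps. Flattening a larger image produces a longer sequence, so distant pixels have progressively less influence on the state.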
Taxonomy
Topics: Remote-Sensing Image Classification · Remote Sensing and Land Use · Remote Sensing in Agriculture
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
