MSCrackMamba: Leveraging Vision Mamba for Crack Detection in Fused Multispectral Imagery
Qinfeng Zhu, Yuan Fang, Lei Fan

TL;DR
This paper introduces MSCrackMamba, a two-stage approach that combines super-resolution with a Vision Mamba architecture to improve crack detection in fused multispectral images. It addresses the resolution mismatch between IR and RGB channels and improves detection accuracy.
Contribution
It proposes a new two-stage paradigm that uses super-resolution and Vision Mamba for multispectral crack detection, and it outperforms existing methods on a large dataset.
Findings
Achieves a 3.55% higher mIoU than baseline methods.
Aligns IR and RGB channels through super-resolution.
Improves crack detection accuracy on the Crack900 dataset.
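For readers unfamiliar with the metric, mIoU (mean intersection over union) averages the per-class overlap between a predicted segmentation mask and the ground truth. The following is a minimal NumPy sketch of that general definition, not the paper's evaluation code; the function name `mean_iou`, the two-class (background and crack) setup, and the toy masks are assumptions made for illustration.

```python
import numpy as np

def mean_iou(pred, target, num_classes=2):
    """Mean intersection over union across the classes present in either mask."""
    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        target_c = target == c
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:
            continue  # class absent from both masks, so it is skipped
        inter = np.logical_and(pred_c, target_c).sum()
        ious.append(inter / union)
    return float(np.mean(ious))

# Hypothetical toy masks: 0 = background, 1 = crack.
target = np.array([[0, 0, 1, 1],
                   [0, 0, 1, 1]])
pred = np.array([[0, 0, 0, 1],
                 [0, 0, 1, 1]])

print(mean_iou(pred, target))
```

In this toy case the background class has IoU 4/5 and the crack class has IoU 3/4, so a single mislabeled pixel lowers both per-class scores.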
Abstract
Crack detection is a critical task in structural health monitoring, aimed at assessing the structural integrity of bridges, buildings, and roads to prevent potential failures. Vision-based crack detection has become the mainstream approach due to its ease of implementation and effectiveness. Fusing infrared (IR) channels with red, green and blue (RGB) channels can enhance feature representation and thus improve crack detection. However, IR and RGB channels often differ in resolution. To align them, higher-resolution RGB images typically need to be downsampled to match the IR image resolution, which leads to the loss of fine details. Moreover, crack detection performance is restricted by the limited receptive fields and high computational complexity of traditional image segmentation networks. Inspired by the recently proposed Mamba neural architecture, this study introduces a two-stage…
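The resolution-alignment idea described in the abstract can be sketched as follows: instead of downsampling RGB to the IR resolution, the IR channel is brought up to the RGB resolution and stacked as a fourth channel. This is a rough illustration only. The paper uses super-resolution for this step; the nearest-neighbour upsampling here is a simple stand-in, and the function names `upsample_nearest` and `fuse_rgb_ir`, the integer scale factor, and the image sizes are all assumptions.

```python
import numpy as np

def upsample_nearest(ir, scale):
    """Upsample a 2-D IR image by an integer factor (nearest-neighbour stand-in
    for a learned super-resolution model)."""
    return np.repeat(np.repeat(ir, scale, axis=0), scale, axis=1)

def fuse_rgb_ir(rgb, ir):
    """Stack an (H, W, 3) RGB image and a lower-resolution IR image into (H, W, 4)."""
    h, w, _ = rgb.shape
    scale = h // ir.shape[0]
    if ir.shape[0] * scale != h or ir.shape[1] * scale != w:
        raise ValueError("RGB size must be an integer multiple of IR size")
    ir_up = upsample_nearest(ir, scale)
    # RGB keeps its full resolution; IR is appended as the last channel.
    return np.concatenate([rgb, ir_up[..., None]], axis=-1)

# Hypothetical sizes: 8x8 RGB and 4x4 IR.
rgb = np.zeros((8, 8, 3), dtype=np.float32)
ir = np.arange(16, dtype=np.float32).reshape(4, 4)

fused = fuse_rgb_ir(rgb, ir)
print(fused.shape)  # (8, 8, 4)
```

Upsampling the IR channel rather than downsampling RGB preserves the fine RGB detail that thin cracks depend on, which is the motivation the abstract gives for the first stage.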
Taxonomy
Methods: Mamba: Linear-Time Sequence Modeling with Selective State Spaces · ALIGN
