MSCrackMamba: Leveraging Vision Mamba for Crack Detection in Fused Multispectral Imagery

Qinfeng Zhu; Yuan Fang; Lei Fan

arXiv:2412.06211 · cs.CV · January 26, 2026

TL;DR

This paper introduces MSCrackMamba, a two-stage approach that combines super-resolution with a Vision Mamba architecture to improve crack detection in fused multispectral images. It addresses the resolution mismatch between IR and RGB channels and improves detection accuracy.

Contribution

The paper proposes a two-stage paradigm that pairs super-resolution with Vision Mamba for multispectral crack detection and outperforms baseline methods on the Crack900 dataset.

Findings

01. Achieved a 3.55% higher mIoU than baseline methods.

02. Aligns IR and RGB channels through super-resolution.

03. Improved crack detection accuracy on the Crack900 dataset.
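
The mIoU figure in the findings is the mean intersection-over-union across segmentation classes. As a minimal sketch of how that metric is commonly computed for a crack/background mask, the function below uses plain Python lists; the name `mean_iou`, the two-class default, and the toy masks are illustrative assumptions and are not taken from the paper's evaluation code.

```python
from typing import Sequence


def mean_iou(
    pred: Sequence[Sequence[int]],
    target: Sequence[Sequence[int]],
    num_classes: int = 2,
) -> float:
    """Mean intersection-over-union over the classes that appear in either mask."""
    ious = []
    for cls in range(num_classes):
        intersection = 0
        union = 0
        for pred_row, target_row in zip(pred, target):
            for p, t in zip(pred_row, target_row):
                in_pred, in_target = p == cls, t == cls
                intersection += int(in_pred and in_target)
                union += int(in_pred or in_target)
        if union > 0:  # skip classes absent from both masks
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy 2x4 masks (assumed data): 0 = background, 1 = crack.
pred = [[0, 0, 1, 1], [0, 1, 1, 0]]
target = [[0, 0, 1, 0], [0, 1, 1, 0]]
score = mean_iou(pred, target)  # background IoU 4/5, crack IoU 3/4
```

Averaging per-class IoU gives the thin crack class the same weight as the much larger background class, which is why mIoU is more informative than pixel accuracy for this kind of task.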

Abstract

Crack detection is a critical task in structural health monitoring, aimed at assessing the structural integrity of bridges, buildings, and roads to prevent potential failures. Vision-based crack detection has become the mainstream approach due to its ease of implementation and effectiveness. Fusing infrared (IR) channels with red, green and blue (RGB) channels can enhance feature representation and thus improve crack detection. However, IR and RGB channels often differ in resolution. To align them, higher-resolution RGB images typically need to be downsampled to match the IR image resolution, which leads to the loss of fine details. Moreover, crack detection performance is restricted by the limited receptive fields and high computational complexity of traditional image segmentation networks. Inspired by the recently proposed Mamba neural architecture, this study introduces a two-stage…
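
The alignment step described above can be illustrated with a small sketch: rather than downsampling RGB, the lower-resolution IR channel is brought up to the RGB resolution and then stacked as an extra channel. This is only a sketch under stated assumptions: nearest-neighbour upsampling stands in for a learned super-resolution network, and the helper names `upsample_nearest` and `fuse`, the 2x scale factor, and the toy arrays are all hypothetical rather than the authors' implementation.

```python
from typing import List

Channel = List[List[float]]  # one image channel, indexed [row][col]


def upsample_nearest(channel: Channel, factor: int) -> Channel:
    """Nearest-neighbour upsampling; a placeholder for a learned super-resolution model."""
    height, width = len(channel), len(channel[0])
    return [
        [channel[r // factor][c // factor] for c in range(width * factor)]
        for r in range(height * factor)
    ]


def fuse(rgb: List[Channel], ir: Channel) -> List[Channel]:
    """Append IR as a fourth channel once its spatial size matches RGB."""
    height, width = len(rgb[0]), len(rgb[0][0])
    if (len(ir), len(ir[0])) != (height, width):
        raise ValueError("IR and RGB resolutions differ")
    return rgb + [ir]


# Assumed toy data: three 4x4 RGB channels and one 2x2 IR channel.
rgb = [[[float(ch)] * 4 for _ in range(4)] for ch in range(3)]
ir = [[0.1, 0.2], [0.3, 0.4]]

ir_up = upsample_nearest(ir, factor=2)  # 2x2 -> 4x4, RGB stays at full resolution
fused = fuse(rgb, ir_up)                # 4 channels, each 4x4
```

Keeping RGB at its native resolution preserves the fine detail that downsampling would discard; the quality of the fused input then depends on how well the upsampling step reconstructs the IR channel.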

Taxonomy

Methods: Mamba: Linear-Time Sequence Modeling with Selective State Spaces · ALIGN