LIDAR: Lightweight Adaptive Cue-Aware Fusion Vision Mamba for Multimodal Segmentation of Structural Cracks

Hui Liu; Chen Jia; Fan Shi; Xu Cheng; Mengfei Shi; Xia Xie; Shengyong Chen

arXiv:2507.22477·cs.CV·August 1, 2025

TL;DR

LIDAR is a lightweight, adaptive multimodal segmentation network that effectively combines morphological and textural cues for pixel-level crack detection with low computational cost.

Contribution

The paper introduces LIDAR, a novel fusion network with adaptive cue modeling and efficient modules, advancing crack segmentation performance and efficiency.

Findings

01 Outperforms state-of-the-art methods on three datasets.

02 Achieves high F1 and mIoU scores with minimal parameters.

03 Demonstrates real-time applicability due to low computational overhead.
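
For readers unfamiliar with the F1 and mIoU metrics named in the findings, the following is a minimal sketch of how they are commonly computed for a binary crack mask. It is not the paper's evaluation code: the function names, the flat 0/1 mask representation, the two-class (crack and background) averaging for mIoU, and the convention of scoring an empty-versus-empty comparison as 1.0 are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def confusion_counts(pred: Sequence[int], target: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (TP, FP, FN, TN) for two flat binary masks (1 = crack, 0 = background)."""
    if len(pred) != len(target):
        raise ValueError("masks must contain the same number of pixels")
    tp = fp = fn = tn = 0
    for p, t in zip(pred, target):
        if p and t:
            tp += 1
        elif p and not t:
            fp += 1
        elif t:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def f1_score(pred: Sequence[int], target: Sequence[int]) -> float:
    """F1 of the crack class: 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(pred, target)
    denom = 2 * tp + fp + fn
    # Assumed convention: no crack pixels in either mask counts as perfect agreement.
    return 2 * tp / denom if denom else 1.0


def mean_iou(pred: Sequence[int], target: Sequence[int]) -> float:
    """Mean of the crack-class IoU and the background-class IoU (two-class mIoU)."""
    tp, fp, fn, tn = confusion_counts(pred, target)
    crack_union = tp + fp + fn
    background_union = tn + fp + fn
    iou_crack = tp / crack_union if crack_union else 1.0
    iou_background = tn / background_union if background_union else 1.0
    return (iou_crack + iou_background) / 2


# Toy six-pixel example (hypothetical data, not taken from the paper).
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
print(confusion_counts(pred, target))
print(f1_score(pred, target))
print(mean_iou(pred, target))
```

A 2D segmentation map can be flattened row by row before being passed to these functions. Benchmarks differ in whether they average per image or over the whole dataset, and in which classes they include in mIoU, so the numbers reported in the paper may follow a different protocol.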

Abstract

Achieving pixel-level segmentation with low computational cost using multimodal data remains a key challenge in crack segmentation tasks. Existing methods lack the capability for adaptive perception and efficient interactive fusion of cross-modal features. To address these challenges, we propose a Lightweight Adaptive Cue-Aware Vision Mamba network (LIDAR), which efficiently perceives and integrates morphological and textural cues from different modalities under multimodal crack scenarios, generating clear pixel-level crack segmentation maps. Specifically, LIDAR is composed of a Lightweight Adaptive Cue-Aware Visual State Space module (LacaVSS) and a Lightweight Dual Domain Dynamic Collaborative Fusion module (LD3CF). LacaVSS adaptively models crack cues through the proposed mask-guided Efficient Dynamic Guided Scanning Strategy (EDG-SS), while LD3CF leverages an Adaptive Frequency…
