CrackESS: A Self-Prompting Crack Segmentation System for Edge Devices
Yingchu Wang, Ji He, Shijie Yu

TL;DR
CrackESS is a lightweight, self-prompting crack segmentation system designed for edge devices, combining YOLOv8 and SAM models with a refinement module to improve accuracy and efficiency in structural health monitoring.
Contribution
The paper introduces a novel crack segmentation system that integrates self-prompting and refinement techniques, optimized for edge device deployment in infrastructure monitoring.
Findings
Effective crack segmentation on multiple datasets
Demonstrated deployment on a climbing robot system
Outperforms existing methods in efficiency and accuracy
Abstract
Structural Health Monitoring (SHM) is a sustainable and essential approach for infrastructure maintenance, enabling the early detection of structural defects. Leveraging computer vision (CV) methods for automated infrastructure monitoring can significantly enhance monitoring efficiency and precision. However, these methods often face challenges in efficiency and accuracy, particularly in complex environments. Recent CNN-based and SAM-based approaches have demonstrated excellent performance in crack segmentation, but their high computational demands limit their applicability on edge devices. This paper introduces CrackESS, a novel system for detecting and segmenting concrete cracks. The approach first utilizes a YOLOv8 model for self-prompting and a LoRA-based fine-tuned SAM model for crack segmentation, followed by refining the segmentation masks through the proposed Crack Mask…
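The self-prompting flow described in the abstract (a detector proposes boxes, and each box is passed as a prompt to a segmenter) can be sketched in a few lines. This is a minimal illustration under loud assumptions, not the authors' implementation: `detect_boxes` and `segment_with_box` are hypothetical stand-ins that use simple thresholding in place of YOLOv8 and the LoRA fine-tuned SAM, and the mask refinement step is omitted because the abstract is truncated before describing it.

```python
import numpy as np

def detect_boxes(image, thresh=0.5):
    """Hypothetical stand-in for the detector (YOLOv8 in the paper).

    Returns a list with one (x0, y0, x1, y1) box around all pixels above
    `thresh`, or an empty list if no pixel qualifies.
    """
    ys, xs = np.nonzero(image > thresh)
    if ys.size == 0:
        return []
    return [(int(xs.min()), int(ys.min()), int(xs.max()) + 1, int(ys.max()) + 1)]

def segment_with_box(image, box, thresh=0.5):
    """Hypothetical stand-in for a box-prompted segmenter (SAM in the paper).

    Only pixels inside the box prompt can be marked as crack.
    """
    x0, y0, x1, y1 = box
    mask = np.zeros(image.shape, dtype=bool)
    mask[y0:y1, x0:x1] = image[y0:y1, x0:x1] > thresh
    return mask

def self_prompted_segmentation(image):
    """Run the detector, then feed each box to the segmenter as a prompt."""
    mask = np.zeros(image.shape, dtype=bool)
    for box in detect_boxes(image):
        mask |= segment_with_box(image, box)  # union of per-box masks
    return mask

# Toy 8x8 image with a vertical "crack" four pixels long in column 3.
image = np.zeros((8, 8))
image[2:6, 3] = 1.0
mask = self_prompted_segmentation(image)
```

The point of the structure is that no human supplies the prompt: the detector's output is the segmenter's input, so the pipeline can run unattended on a device.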
Taxonomy
Topics: Non-Destructive Testing Techniques · Industrial Vision Systems and Defect Detection · Welding Techniques and Residual Stresses
Methods: Segment Anything Model · You Only Look Once · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
