Optimizing YOLO Architectures for Optimal Road Damage Detection and Classification: A Comparative Study from YOLOv7 to YOLOv10

Vung Pham, Lan Dong Thi Ngoc, and Duy-Linh Bui

arXiv:2410.08409 · cs.CV · October 14, 2024 · 2 cites

Open Access

TL;DR

This study compares various YOLO architectures, including custom and lightweight models, for efficient and accurate road damage detection, emphasizing inference speed optimization and dataset augmentation.

Contribution

The paper introduces a custom YOLOv7-based model that combines Coordinate Attention with model reparameterization to improve both detection performance and inference speed.

Findings

01. Ensemble model achieves an F1 score of 0.7027
02. Inference speed of 0.0547 seconds per image
03. Incorporation of an external pothole dataset enhances detection
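
To put the two reported numbers in context, the following minimal Python sketch shows the standard F1 definition (the harmonic mean of precision and recall) and the conversion from per-image latency to throughput. The function names and the precision/recall values are illustrative assumptions, not taken from the paper; only the 0.0547 s latency comes from the findings above.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def images_per_second(seconds_per_image: float) -> float:
    """Convert per-image latency into throughput."""
    return 1.0 / seconds_per_image


# Hypothetical precision and recall, chosen only to exercise the formula.
example_f1 = f1_score(precision=0.75, recall=0.5)

# Latency reported in the findings list.
throughput = images_per_second(0.0547)

print(f"example F1: {example_f1:.4f}")
print(f"throughput: {throughput:.2f} images/s")
```

At 0.0547 seconds per image, the reported latency corresponds to roughly 18 images per second.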

Abstract

Maintaining roadway infrastructure is essential for ensuring a safe, efficient, and sustainable transportation system. However, manual data collection for detecting road damage is time-consuming, labor-intensive, and poses safety risks. Recent advancements in artificial intelligence, particularly deep learning, offer a promising solution for automating this process using road images. This paper presents a comprehensive workflow for road damage detection using deep learning models, focusing on optimizations for inference speed while preserving detection accuracy. Specifically, to accommodate hardware limitations, large images are cropped, and lightweight models are utilized. Additionally, an external pothole dataset is incorporated to enhance the detection of this underrepresented damage class. The proposed approach employs multiple model architectures, including a custom YOLOv7 model…
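
The abstract says large images are cropped to fit hardware limits but, as excerpted here, does not say how. As one possible illustration only, the sketch below computes non-overlapping crop boxes that cover an image of a given size. The function name `crop_boxes`, the 640-pixel tile size, and the tiling strategy itself are assumptions made for this example, not the authors' procedure.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def crop_boxes(width: int, height: int, tile: int = 640) -> List[Box]:
    """Return non-overlapping crop boxes covering a width x height image.

    Hypothetical helper: boxes on the right and bottom edges are clipped
    to the image bounds, so they may be smaller than `tile`.
    """
    boxes = []
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            right = min(left + tile, width)
            bottom = min(top + tile, height)
            boxes.append((left, top, right, bottom))
    return boxes


# Example: a 1920 x 1280 image splits into a 3 x 2 grid of 640-pixel tiles.
boxes = crop_boxes(1920, 1280)
print(len(boxes), boxes[0], boxes[-1])
```

Each box can be passed to an image library's crop routine, and detections from a crop can be mapped back to the full image by adding the box's left and top offsets.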

Taxonomy

Topics: Infrastructure Maintenance and Monitoring · Advanced Neural Network Applications · Industrial Vision Systems and Defect Detection

Methods: Softmax · Attention Is All You Need · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Coordinate attention