YOLO-ELA: Efficient Local Attention Modeling for High-Performance Real-Time Insulator Defect Detection
Olalekan Akindele, Joshua Atolagbe

TL;DR
This paper introduces YOLO-ELA, a YOLOv8-based real-time insulator defect detector that adds Efficient Local Attention blocks and reaches 96.9% mAP0.5 at 74.63 FPS on UAV imagery despite limited training data.
Contribution
The paper proposes YOLO-ELA, which adds Efficient Local Attention (ELA) blocks to the neck of YOLOv8 and uses the SIoU loss, improving small-object detection and real-time performance in UAV-based insulator defect identification.
Findings
Achieved 96.9% mAP0.5 on UAV images.
Real-time detection at 74.63 FPS.
Outperformed baseline models in accuracy and speed.
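To put the reported throughput in per-frame terms, the short sketch below converts frames per second into a latency budget. The helper name frame_budget_ms is hypothetical and used only for illustration; the 74.63 FPS figure is the one reported above.

```python
def frame_budget_ms(fps: float) -> float:
    """Return the average time available per frame, in milliseconds."""
    return 1000.0 / fps


# 74.63 FPS is the throughput reported in the findings above.
print(f"{frame_budget_ms(74.63):.2f} ms per frame")  # prints "13.40 ms per frame"
```

At roughly 13.4 ms per frame, the model stays well inside the 33.3 ms budget of a 30 FPS video stream.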
Abstract
Existing detection methods for insulator defect identification from unmanned aerial vehicles (UAV) struggle with complex background scenes and small objects, leading to suboptimal accuracy and a high number of false-positive detections. Using the concept of local attention modeling, this paper proposes a new attention-based foundation architecture, YOLO-ELA, to address this issue. Efficient Local Attention (ELA) blocks were added to the neck of the one-stage YOLOv8 architecture to shift the model's attention from background features towards features of insulators with defects. The SCYLLA Intersection-Over-Union (SIoU) criterion function was used to reduce detection loss, accelerate model convergence, and increase the model's sensitivity towards small insulator defects, yielding higher true positive outcomes. Due to a limited dataset, data augmentation techniques were utilized…
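The abstract names the SIoU criterion but this summary does not reproduce its formula. As a rough illustration only, the sketch below follows the SIoU formulation as it is commonly stated (an IoU term plus angle-aware distance and shape costs) for a single pair of boxes in pure Python. The function name siou_loss, the (x1, y1, x2, y2) box layout, the shape exponent theta = 4, and the eps stabilizer are all assumptions made for this example; they are not taken from the paper, whose implementation may differ.

```python
import math


def siou_loss(pred, target, theta=4.0, eps=1e-7):
    """Illustrative SIoU loss for two axis-aligned boxes (x1, y1, x2, y2).

    theta (shape-cost exponent) and eps are assumed values, not from the paper.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # IoU term: overlap area divided by union area.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    iou = inter / (pw * ph + tw * th - inter + eps)

    # Offsets between box centers, and the smallest enclosing box.
    dx = (tx1 + tx2) / 2 - (px1 + px2) / 2
    dy = (ty1 + ty2) / 2 - (py1 + py2) / 2
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Angle cost: 0 when the centers are axis-aligned, 1 at 45 degrees.
    sigma = math.hypot(dx, dy) + eps
    sin_alpha = min(abs(dy) / sigma, 1.0)
    angle = 1 - 2 * math.sin(math.asin(sin_alpha) - math.pi / 4) ** 2

    # Distance cost: center offsets normalized by the enclosing box,
    # with the exponent modulated by the angle cost.
    gamma = 2 - angle
    rho_x = (dx / (cw + eps)) ** 2
    rho_y = (dy / (ch + eps)) ** 2
    distance = (1 - math.exp(-gamma * rho_x)) + (1 - math.exp(-gamma * rho_y))

    # Shape cost: relative mismatch in width and height.
    omega_w = abs(pw - tw) / (max(pw, tw) + eps)
    omega_h = abs(ph - th) / (max(ph, th) + eps)
    shape = (1 - math.exp(-omega_w)) ** theta + (1 - math.exp(-omega_h)) ** theta

    return 1 - iou + (distance + shape) / 2


box = (0.0, 0.0, 10.0, 10.0)
near = siou_loss(box, (2.0, 0.0, 12.0, 10.0))
far = siou_loss(box, (5.0, 0.0, 15.0, 10.0))
print(near < far)  # the loss grows as the predicted box drifts from the target
```

Unlike a plain IoU loss, the distance and shape terms keep the value informative when boxes barely overlap or do not overlap at all, which is one plausible reason such a loss helps with small defects.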
Taxonomy
Topics: Industrial Vision Systems and Defect Detection · Advanced Neural Network Applications · Image Enhancement Techniques
Methods: Softmax · Attention Is All You Need · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · You Only Look Once
