Enhancing Small Object Detection with YOLO: A Novel Framework for Improved Accuracy and Efficiency
Mahila Moghadami, Mohammad Ali Keyvanrad, Melika Sabaghian

TL;DR
This paper introduces a novel YOLO-based framework with architectural enhancements and improved cropping techniques to significantly boost small object detection accuracy in aerial images, outperforming existing methods like SAHI and CZDet.
Contribution
The paper presents a new model that combines advanced feature extraction, attention mechanisms, and a specialized detection head to improve small object detection accuracy in large-scale aerial imagery.
Findings
Achieved 61.2% mAP on the VisDrone2019 dataset, surpassing the YOLOv5L baseline.
Outperformed SAHI and CZDet frameworks in accuracy.
Demonstrated significant improvements in small object detection performance.
Abstract
This paper investigates and develops methods for detecting small objects in large-scale aerial images. Current approaches for detecting small objects in aerial images often involve image cropping and modifications to detector network architectures. Techniques such as sliding window cropping and architectural enhancements, including higher-resolution feature maps and attention mechanisms, are commonly employed. Given the growing importance of aerial imagery in various critical and industrial applications, the need for robust frameworks for small object detection becomes imperative. To address this need, we adopted the base SW-YOLO approach to enhance speed and accuracy in small object detection by refining the cropping dimensions and overlap of the sliding window, and subsequently enhanced it through architectural modifications. We propose a novel model by modifying the base model…
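The sliding window cropping mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `sliding_window_boxes`, the 512-pixel crop size, and the 25% overlap are all assumptions chosen for illustration, since the abstract does not state the values the paper settles on. The sketch only shows how crop size and overlap together determine the set of crops a detector would be run on.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels


def sliding_window_boxes(width: int, height: int, crop: int, overlap: float) -> List[Box]:
    """Return crop boxes that tile a width x height image.

    Hypothetical helper: `crop` is the square crop side in pixels and
    `overlap` is the fraction of a crop shared with its neighbour.
    """
    if crop <= 0:
        raise ValueError("crop must be positive")
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in [0, 1)")

    # Distance between the left (or top) edges of consecutive crops.
    stride = max(1, int(crop * (1 - overlap)))

    def starts(length: int) -> List[int]:
        # An image side shorter than the crop yields a single window.
        if length <= crop:
            return [0]
        positions = list(range(0, length - crop, stride))
        # Add a final window flush with the border so no pixels are missed.
        positions.append(length - crop)
        return positions

    return [
        (x, y, min(x + crop, width), min(y + crop, height))
        for y in starts(height)
        for x in starts(width)
    ]


if __name__ == "__main__":
    # Assumed example values, not taken from the paper.
    for box in sliding_window_boxes(1000, 600, crop=512, overlap=0.25):
        print(box)
```

A larger overlap lowers the chance that a small object is cut in half at a crop boundary, but it increases the number of crops and therefore the inference cost, which is the trade-off the abstract refers to when it mentions refining cropping dimensions and overlap.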
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications
