YOLO-Drone: Airborne real-time detection of dense small objects from high-altitude perspective
Li Zhu, Jiahui Xiong, Feng Xiong, Hanzheng Hu, Zhengnan Jiang

TL;DR
YOLO-Drone is a real-time UAV object detection algorithm that introduces novel backbone and feature aggregation modules, achieving superior accuracy and speed, especially in night conditions with special lighting.
Contribution
The paper proposes YOLO-Drone, which combines a new backbone (Darknet59), the MSPP-FPN feature aggregation module, and a GIoU loss to improve small object detection in UAV imagery under various lighting conditions.
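The GIoU loss named above can be illustrated with a minimal sketch of the standard Generalized IoU definition for two axis-aligned boxes. The (x1, y1, x2, y2) box format and the function names `giou` and `giou_loss` are assumptions made for illustration; the paper's own implementation may differ in detail.

```python
def giou(box_a, box_b):
    """Generalized IoU of two boxes given as (x1, y1, x2, y2) (assumed format)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Areas of the two boxes (clamped so malformed boxes count as empty).
    area_a = max(0.0, ax2 - ax1) * max(0.0, ay2 - ay1)
    area_b = max(0.0, bx2 - bx1) * max(0.0, by2 - by1)

    # Intersection rectangle; width or height is zero when boxes do not overlap.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = area_a + area_b - inter

    # Smallest axis-aligned box enclosing both inputs.
    hull_w = max(ax2, bx2) - min(ax1, bx1)
    hull_h = max(ay2, by2) - min(ay1, by1)
    hull = hull_w * hull_h

    if union <= 0.0 or hull <= 0.0:
        return 0.0  # degenerate input; this fallback is a choice of the sketch

    iou = inter / union
    # Penalize the part of the enclosing box covered by neither input.
    return iou - (hull - union) / hull


def giou_loss(pred_box, target_box):
    """Loss is 1 - GIoU, so it ranges from 0 (perfect match) towards 2."""
    return 1.0 - giou(pred_box, target_box)
```

Unlike plain IoU, the value keeps changing when the boxes do not overlap at all, which gives a training signal even for predictions that miss the target entirely. This matters for small objects, where a miss of a few pixels already means zero overlap.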
Findings
Outperforms state-of-the-art methods, with mAP improvements of 10.13% and 8.59% on the UAVDT and VisDrone datasets, respectively.
Achieves 53 FPS inference speed, suitable for real-time UAV applications.
High performance under special silicon-based golden LEDs, with up to 87.71% mAP.
Abstract
Unmanned Aerial Vehicles (UAVs), specifically drones equipped with remote sensing object detection technology, have rapidly gained a broad spectrum of applications and emerged as one of the primary research focuses in the field of computer vision. Although UAV remote sensing systems can detect various objects, small-scale objects are challenging to detect reliably due to factors such as object size, image degradation, and real-time limitations. To tackle these issues, a real-time object detection algorithm (YOLO-Drone) is proposed and applied to two new UAV platforms as well as a specific light source (silicon-based golden LED). YOLO-Drone presents several novelties: 1) a new backbone, Darknet59; 2) a new complex feature aggregation module, MSPP-FPN, that incorporates one spatial pyramid pooling module and three atrous spatial pyramid pooling modules; 3) and the use…
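As a rough illustration of the spatial pyramid pooling idea mentioned in the abstract, the sketch below applies stride-1 max pooling at several window sizes to a single-channel feature map and stacks the results with the input. The kernel sizes (5, 9, 13) follow a configuration common in YOLO-family detectors and are an assumption here, as are the function names `max_pool_same` and `spp_block`; the abstract does not specify how MSPP-FPN configures its pooling, and the atrous variants are not shown.

```python
def max_pool_same(fmap, k):
    """Stride-1 max pooling over a 2D list, keeping the input size.

    Out-of-bounds positions are ignored, which matches padding with
    negative infinity. k is assumed to be odd.
    """
    h, w = len(fmap), len(fmap[0])
    r = k // 2
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            window = [
                fmap[y][x]
                for y in range(max(0, i - r), min(h, i + r + 1))
                for x in range(max(0, j - r), min(w, j + r + 1))
            ]
            row.append(max(window))
        out.append(row)
    return out


def spp_block(fmap, kernels=(5, 9, 13)):
    """Return the input map plus one pooled map per kernel size.

    The returned list stands in for channel-wise concatenation: each entry
    has the same height and width but a different receptive field.
    """
    return [fmap] + [max_pool_same(fmap, k) for k in kernels]
```

Because every branch keeps the spatial resolution, the outputs can be concatenated directly, letting later layers see the same location at several context sizes without downsampling away small objects.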
Taxonomy
Topics: Advanced Neural Network Applications · Video Surveillance and Tracking Methods · Advanced Image and Video Retrieval Techniques
Methods: Spatial Pyramid Pooling · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
