Real-Time Flying Object Detection with YOLOv8

Dillon Reis; Jordan Kupec; Jacqueline Hong; Ahmad Daoudi

arXiv:2305.09972 · cs.CV · May 24, 2024 · 340 cites

TL;DR

This paper develops a real-time flying object detection model using YOLOv8, achieving high accuracy and speed through transfer learning and a generalized training approach, suitable for real-world applications.

Contribution

The paper introduces a generalized YOLOv8-based model trained on diverse flying objects and refines it via transfer learning for improved real-world detection performance.

Findings

01. The refined model achieves 99.1% mAP50.

02. The generalized model reaches 79.2% mAP50 at 50 fps.

03. The models handle occlusion and small objects effectively.
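The findings are reported as mAP50: average precision with a detection counted as correct when its box overlaps a ground-truth box with IoU of at least 0.5, averaged over classes. The sketch below illustrates that metric only and is not the paper's evaluation code. The function names `iou`, `ap50`, and `mean_ap50` are hypothetical, the matching is a simplified greedy scheme, and the area under the precision-recall curve uses all-point interpolation, whereas evaluation toolkits may sample the curve differently.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), corners in pixels


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def ap50(preds: List[Tuple[float, Box]], truths: List[Box], thr: float = 0.5) -> float:
    """Average precision for one class at a single IoU threshold (illustrative)."""
    if not truths:
        return 0.0
    matched = [False] * len(truths)
    hits: List[bool] = []
    # Visit predictions from most to least confident.
    for _score, box in sorted(preds, key=lambda p: p[0], reverse=True):
        best_j, best_iou = -1, 0.0
        for j, truth in enumerate(truths):
            if matched[j]:
                continue  # each ground-truth box can be claimed only once
            overlap = iou(box, truth)
            if overlap > best_iou:
                best_j, best_iou = j, overlap
        if best_j >= 0 and best_iou >= thr:
            matched[best_j] = True
            hits.append(True)   # true positive
        else:
            hits.append(False)  # false positive

    # Precision and recall after each prediction.
    tp = fp = 0
    recalls: List[float] = []
    precisions: List[float] = []
    for hit in hits:
        if hit:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / len(truths))
        precisions.append(tp / (tp + fp))

    # Make precision non-increasing in recall, then integrate.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    area, prev_recall = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        area += (r - prev_recall) * p
        prev_recall = r
    return area


def mean_ap50(per_class_ap: Dict[str, float]) -> float:
    """mAP50 is the mean of the per-class AP values."""
    return sum(per_class_ap.values()) / len(per_class_ap) if per_class_ap else 0.0
```

With two ground-truth boxes and three predictions, one of which misses entirely, `ap50` returns a value below 1.0 because the false positive lowers precision before the second object is recovered.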

Abstract

This paper presents a generalized model for real-time detection of flying objects that can be used for transfer learning and further research, as well as a refined model that achieves state-of-the-art results for flying object detection. We achieve this by training our first (generalized) model on a data set containing 40 different classes of flying objects, forcing the model to extract abstract feature representations. We then perform transfer learning with these learned parameters on a data set more representative of real world environments (i.e. higher frequency of occlusion, very small spatial sizes, rotations, etc.) to generate our refined model. Object detection of flying objects remains challenging due to large variances of object spatial sizes/aspect ratios, rate of speed, occlusion, and clustered backgrounds. To address some of the presented challenges while simultaneously…
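The two-stage recipe in the abstract (train a generalized model, then reuse its weights on a more realistic data set) could be sketched with the third-party `ultralytics` package as follows. This is a minimal sketch under stated assumptions, not the authors' training script: the data set YAML paths, the `yolov8m.pt` starting checkpoint, the epoch counts, the image size, and the helper names `stage_kwargs` and `train_two_stage` are all hypothetical placeholders.

```python
from typing import Any, Dict


def stage_kwargs(data_yaml: str, name: str, epochs: int, imgsz: int = 640) -> Dict[str, Any]:
    """Build the keyword arguments for one training stage (hypothetical helper)."""
    return {"data": data_yaml, "name": name, "epochs": epochs, "imgsz": imgsz}


def train_two_stage(generalized_yaml: str, refined_yaml: str,
                    start_weights: str = "yolov8m.pt"):
    """Train a generalized model, then fine-tune its best weights (illustrative)."""
    # Imported lazily so the helper above works without the package installed.
    from ultralytics import YOLO  # pip install ultralytics

    # Stage 1: train on the broad, many-class data set.
    # The epoch count is a placeholder, not a value from the paper.
    generalized = YOLO(start_weights)
    generalized.train(**stage_kwargs(generalized_yaml, "generalized", epochs=100))

    # The trainer records where the best checkpoint of the run was saved.
    best_weights = str(generalized.trainer.best)

    # Stage 2: transfer learning, starting from the stage-1 parameters and
    # training on the data set that better reflects real-world conditions.
    refined = YOLO(best_weights)
    refined.train(**stage_kwargs(refined_yaml, "refined", epochs=50))
    return refined


if __name__ == "__main__":
    # Hypothetical file names; replace with real data set definitions.
    train_two_stage("generalized_flying_objects.yaml", "refined_flying_objects.yaml")
```

Initializing the second stage from the first stage's checkpoint, rather than from generic pretrained weights, is what carries the learned feature representations over to the refined model.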

Taxonomy

Topics: Advanced Neural Network Applications · Video Surveillance and Tracking Methods · Human Pose and Action Recognition

Methods: You Only Look Once · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings