YOLO-Z: Improving small object detection in YOLOv5 for autonomous vehicles

Aduen Benjumea; Izzeddin Teeti; Fabio Cuzzolin; Andrew Bradley

arXiv:2112.11798 · cs.CV · January 4, 2023 · 117 cites

TL;DR

This paper introduces YOLO-Z, a family of modified YOLOv5 models optimized for small object detection in autonomous vehicles, achieving up to 6.9% higher mAP on small objects at 50% IOU for a 3 ms increase in inference time.

Contribution

The study proposes a series of structural modifications to YOLOv5, creating YOLO-Z models that enhance small object detection performance for autonomous vehicle applications.

Findings

01. Up to 6.9% improvement in mAP for small objects at 50% IOU

02. 3 ms increase in inference time

03. Structural modifications impact detection accuracy and speed
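The "50% IOU" criterion in the findings can be made concrete with a small sketch. The code below is illustrative only and is not taken from the paper or from the YOLOv5 code base: the function names `iou` and `is_match`, the `(x1, y1, x2, y2)` box format, and the example boxes are all assumptions. It computes intersection over union for two axis-aligned boxes and counts a prediction as a match when the overlap reaches the 0.5 threshold.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2).

    Hypothetical helper for illustration; not the paper's implementation.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlapping region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union = sum of both areas minus the overlap counted twice.
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, gt_box, threshold=0.5):
    """True when the prediction overlaps the ground truth at or above the threshold."""
    return iou(pred_box, gt_box) >= threshold


# An assumed 8x8 pixel ground-truth box and two slightly shifted predictions.
gt = (0, 0, 8, 8)
shift_2px = (2, 0, 10, 8)   # IoU = 48 / 80 = 0.6
shift_3px = (3, 0, 11, 8)   # IoU = 40 / 88, roughly 0.45

print(is_match(shift_2px, gt))  # True
print(is_match(shift_3px, gt))  # False
```

In this toy case, one extra pixel of localisation error is enough to turn a match into a miss for an 8x8 box, which gives some intuition for why small objects are hard to score well on at a fixed IoU threshold.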

Abstract

As autonomous vehicles and autonomous racing rise in popularity, so does the need for faster and more accurate detectors. While our naked eyes are able to extract contextual information almost instantly, even from far away, limitations in image resolution and computational resources make detecting smaller objects (that is, objects that occupy a small pixel area in the input image) a genuinely challenging task for machines and a wide-open research field. This study explores how the popular YOLOv5 object detector can be modified to improve its performance in detecting smaller objects, with a particular application in autonomous racing. To achieve this, we investigate how replacing certain structural elements of the model (as well as their connections and other parameters) can affect performance and inference time. In doing so, we propose a series of models at different scales, which we name…

Taxonomy

Topics: Advanced Neural Network Applications · Visual Attention and Saliency Detection · Age of Information Optimization