A Comprehensive Review of YOLO Architectures in Computer Vision: From YOLOv1 to YOLOv8 and YOLO-NAS
Juan Terven, Diana Cordova-Esparza

TL;DR
This paper provides a comprehensive review of the evolution of YOLO object detection architectures from YOLOv1 to YOLOv8, YOLO-NAS, and YOLO with Transformers, highlighting innovations, training techniques, and future directions.
Contribution
It offers an in-depth analysis of each YOLO version's architectural changes, training strategies, and key contributions, serving as a valuable resource for future research.
Findings
Detailed comparison of YOLO architectures and their improvements
Identification of key innovations in each YOLO version
Insights into future research directions for real-time detection
Abstract
YOLO has become a central real-time object detection system for robotics, driverless cars, and video monitoring applications. We present a comprehensive analysis of YOLO's evolution, examining the innovations and contributions in each iteration from the original YOLO up to YOLOv8, YOLO-NAS, and YOLO with Transformers. We start by describing the standard metrics and postprocessing; then, we discuss the major changes in network architecture and training tricks for each model. Finally, we summarize the essential lessons from YOLO's development and provide a perspective on its future, highlighting potential research directions to enhance real-time object detection systems.
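The abstract does not name the metrics and postprocessing it refers to. In object detection these are usually Intersection over Union (IoU), which underlies average precision, and non-maximum suppression (NMS). The following is a minimal sketch of both, assuming boxes in `(x1, y1, x2, y2)` corner format and a greedy NMS variant. The names `iou` and `nms` and the default threshold of 0.5 are illustrative choices, not taken from the paper.

```python
from typing import List, Sequence, Tuple

# Assumed box format: (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    # Corners of the overlap rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: return indices of kept boxes, highest score first."""
    # Visit candidates from most to least confident.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    print(nms(boxes, scores))
```

In this example the second box overlaps the first with an IoU of about 0.68, so it is suppressed, while the third box does not overlap either and is kept. Production detectors typically run NMS per class and use vectorized implementations; this sketch ignores both for brevity.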
Taxonomy
Topics: Advanced Neural Network Applications · COVID-19 diagnosis using AI · Brain Tumor Detection and Classification
Methods: You Only Look Once
