YOLOv8 to YOLO11: A Comprehensive Architecture In-depth Comparative Review

Priyanto Hidayatullah; Nurjannah Syakrani; Muhammad Rizqi Sholahuddin; Trisna Gelar; Refdinal Tubagus

arXiv:2501.13400·cs.CV·May 5, 2026

TL;DR

This paper provides a detailed comparative review of YOLOv8 to YOLO11, analyzing their architectures to clarify differences and improvements, addressing gaps in official documentation and scholarly publications.

Contribution

It offers an in-depth architectural comparison of the latest YOLO models, based on analysis of academic papers, documentation, and source code, highlighting key similarities and differences.

Findings

01. Each YOLO version shows architectural improvements and feature extraction enhancements.

02. Some blocks in the models remain unchanged across versions.

03. Lack of official diagrams complicates understanding and future development.

Abstract

Note: This is a preliminary version of the manuscript. The final, peer-reviewed, and substantially revised version has been published in Jurnal RESTI. Readers are encouraged to access and cite the published version: DOI: https://doi.org/10.29207/resti.v10i2.6598

In the field of deep learning-based computer vision, YOLO is revolutionary. With respect to deep learning models, YOLO is also the one that is evolving the most rapidly. Unfortunately, not every YOLO model possesses scholarly publications. Moreover, there exists a YOLO model that lacks a publicly accessible official architectural diagram. Naturally, this engenders challenges, such as complicating the understanding of how the model operates in practice. Furthermore, the review articles that are presently available do not investigate the specifics of each model. The objective of this study is to present a comprehensive and…
