YOLOv11: An Overview of the Key Architectural Enhancements

Rahima Khanam; Muhammad Hussain

arXiv:2410.17725·cs.CV·October 24, 2024·476 cites

TL;DR

YOLOv11 introduces architectural innovations like C3k2, SPPF, and C2PSA, significantly enhancing object detection performance and versatility across various computer vision tasks and model sizes.

Contribution

This paper provides a comprehensive analysis of YOLOv11's architectural enhancements and evaluates its improved performance and versatility in multiple computer vision applications.

Findings

01. Enhanced feature extraction with new architectural blocks
02. Improved mean Average Precision (mAP) over previous models
03. Versatility across different model sizes and tasks

Abstract

This study presents an architectural analysis of YOLOv11, the latest iteration in the YOLO (You Only Look Once) series of object detection models. We examine the model's architectural innovations, including the introduction of the C3k2 (Cross Stage Partial with kernel size 2) block, SPPF (Spatial Pyramid Pooling - Fast), and C2PSA (Convolutional block with Parallel Spatial Attention) components, which improve the model's performance in several ways, such as enhanced feature extraction. The paper explores YOLOv11's expanded capabilities across various computer vision tasks, including object detection, instance segmentation, pose estimation, and oriented object detection (OBB). We review the model's performance improvements in terms of mean Average Precision (mAP) and computational efficiency compared to its predecessors, with a focus on the trade-off between parameter count…
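To make the SPPF component named in the abstract more concrete, here is a minimal sketch of the pooling idea behind it, in pure Python on a single-channel feature map. It assumes the commonly used formulation: one stride-1 max pool (kernel size 5, size-preserving padding) applied three times in sequence, with the input and all three pooled maps then concatenated along the channel axis. The function names `max_pool_same` and `sppf_channels` are hypothetical, and the 1x1 convolutions that usually surround the pooling stage are omitted. This is an illustration of the technique, not the paper's or the repository's implementation.

```python
def max_pool_same(grid, k=5):
    """Stride-1 max pool over a 2D list; windows are clipped at the borders,
    so the output has the same height and width as the input."""
    pad = k // 2
    h, w = len(grid), len(grid[0])
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            # Collect every value inside the k x k window centred on (i, j).
            window = [
                grid[y][x]
                for y in range(max(0, i - pad), min(h, i + pad + 1))
                for x in range(max(0, j - pad), min(w, j + pad + 1))
            ]
            row.append(max(window))
        out.append(row)
    return out


def sppf_channels(grid, k=5):
    """Return the four maps an SPPF-style block would concatenate:
    the input plus three successively pooled versions of it."""
    p1 = max_pool_same(grid, k)
    p2 = max_pool_same(p1, k)
    p3 = max_pool_same(p2, k)
    return [grid, p1, p2, p3]


# Small deterministic feature map for illustration (values are arbitrary).
feature_map = [[(i * 7 + j * 13) % 17 for j in range(8)] for i in range(8)]
channels = sppf_channels(feature_map)
```

The "fast" part comes from reuse: chaining a 5x5 pool twice covers the same receptive field as a single 9x9 pool, and chaining it three times matches a 13x13 pool, so the block obtains three pooling scales while only ever computing small windows.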

Code & Models

Repositories

ultralytics/ultralytics
pytorch

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · COVID-19 diagnosis using AI

Methods: Focus