YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications
Chuyi Li, Lulu Li, Hongliang Jiang, Kaiheng Weng, Yifei Geng, Liang Li, Zaidan Ke, Qingyuan Li, Meng Cheng, Weiqiang Nie, Yiduo Li, Bo Zhang, Yufei Liang, Linyuan Zhou, Xiaoming Xu, Xiangxiang Chu, Xiaoming Wei, Xiaolin Wei

TL;DR
YOLOv6 introduces a suite of optimized, deployment-ready object detection networks tailored for industrial applications, achieving state-of-the-art accuracy and speed on the COCO dataset across various scales.
Contribution
The paper presents YOLOv6, a new version of YOLO with improved network design, training, and optimization techniques that specifically targets industrial deployment scenarios.
Findings
YOLOv6-N achieves 35.9% AP on COCO at 1234 FPS.
YOLOv6-S achieves 43.5% AP on COCO at 495 FPS.
Quantized YOLOv6-S reaches 43.3% AP on COCO at 869 FPS.
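To make the trade-off in these figures easier to read, the short sketch below converts each reported throughput into an average time per image and compares the quantized YOLOv6-S against the unquantized one. The numbers are copied from the findings above; the names `REPORTED`, `ms_per_image`, and `compare` are illustrative and not part of any YOLOv6 code. The sketch assumes that time per image is simply the reciprocal of throughput, which may not match single-image latency when throughput is measured on batches.

```python
from typing import Tuple

# Reported figures copied from the findings above (AP in percent, throughput in FPS).
REPORTED = {
    "YOLOv6-N": {"ap": 35.9, "fps": 1234},
    "YOLOv6-S": {"ap": 43.5, "fps": 495},
    "YOLOv6-S (quantized)": {"ap": 43.3, "fps": 869},
}


def ms_per_image(fps: float) -> float:
    """Average time per image in milliseconds implied by a throughput in FPS."""
    return 1000.0 / fps


def compare(baseline: str, candidate: str) -> Tuple[float, float]:
    """Return (throughput ratio, AP change in points) of candidate vs. baseline."""
    b, c = REPORTED[baseline], REPORTED[candidate]
    return c["fps"] / b["fps"], c["ap"] - b["ap"]


if __name__ == "__main__":
    for name, r in REPORTED.items():
        print(f"{name}: {r['ap']}% AP, {ms_per_image(r['fps']):.2f} ms/image")
    speedup, d_ap = compare("YOLOv6-S", "YOLOv6-S (quantized)")
    print(f"quantization: {speedup:.2f}x throughput, {d_ap:+.1f} AP")
```

Under that assumption, the reported numbers imply roughly 0.81 ms, 2.02 ms, and 1.15 ms per image respectively, and quantizing YOLOv6-S buys about 1.76x the throughput for a 0.2-point drop in AP.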
Abstract
For years, the YOLO series has been the de facto industry-level standard for efficient object detection. The YOLO community has prospered overwhelmingly to enrich its use in a multitude of hardware platforms and abundant scenarios. In this technical report, we strive to push its limits to the next level, stepping forward with an unwavering mindset for industry application. Considering the diverse requirements for speed and accuracy in the real environment, we extensively examine the up-to-date object detection advancements either from industry or academia. Specifically, we heavily assimilate ideas from recent network design, training strategies, testing techniques, quantization, and optimization methods. On top of this, we integrate our thoughts and practice to build a suite of deployment-ready networks at various scales to accommodate diversified use cases. With the generous…
Taxonomy
Topics: Advanced Neural Network Applications · Industrial Vision Systems and Defect Detection · Advanced Image and Video Retrieval Techniques
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
