Q-YOLO: Efficient Inference for Real-time Object Detection
Mingze Wang, Huixin Sun, Jun Shi, Xuhui Liu, Baochang Zhang, Xianbin Cao

TL;DR
Q-YOLO is a low-bit quantization approach for YOLO detectors. Its histogram-based activation quantization scheme enables real-time object detection on resource-constrained devices, reducing computational cost while maintaining high accuracy.
Contribution
The paper presents a fully end-to-end Post-Training Quantization pipeline with a Unilateral Histogram-based scheme for efficient YOLO model deployment.
Findings
Outperforms other PTQ methods on the COCO dataset.
Achieves a better accuracy-computation trade-off.
Enables real-time detection on edge devices.
Abstract
Real-time object detection plays a vital role in various computer vision applications. However, deploying real-time object detectors on resource-constrained platforms poses challenges due to high computational and memory requirements. This paper describes a low-bit quantization method to build a highly efficient one-stage detector, dubbed as Q-YOLO, which can effectively address the performance degradation problem caused by activation distribution imbalance in traditional quantized YOLO models. Q-YOLO introduces a fully end-to-end Post-Training Quantization (PTQ) pipeline with a well-designed Unilateral Histogram-based (UH) activation quantization scheme, which determines the maximum truncation values through histogram analysis by minimizing the Mean Squared Error (MSE) quantization errors. Extensive experiments on the COCO dataset demonstrate the effectiveness of Q-YOLO, outperforming…
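The histogram-based search for a maximum truncation value described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `uh_truncation_max`, the bin count, the candidate grid, and the choice to fix the lower bound at the observed minimum are all assumptions made for the example.

```python
import numpy as np

def uh_truncation_max(acts, num_bins=2048, bits=8, num_candidates=100):
    """Search for an upper clipping value that minimizes histogram-weighted MSE.

    Assumption for this sketch: the lower bound stays fixed at the observed
    minimum and only the upper bound is searched.
    """
    acts = np.asarray(acts, dtype=np.float64).ravel()
    lo, hi = float(acts.min()), float(acts.max())
    if hi <= lo:
        return hi  # constant input: nothing to search

    # Summarize the activations as a histogram; bin centers stand in for values.
    counts, edges = np.histogram(acts, bins=num_bins, range=(lo, hi))
    centers = 0.5 * (edges[:-1] + edges[1:])
    levels = 2 ** bits - 1  # number of quantization steps

    best_t, best_err = hi, np.inf
    # Try evenly spaced candidate maxima between lo and hi.
    for t in np.linspace(lo + (hi - lo) / num_candidates, hi, num_candidates):
        scale = (t - lo) / levels
        # Uniform quantization with clipping to [lo, t], then dequantization.
        q = np.clip(np.round((centers - lo) / scale), 0, levels)
        deq = q * scale + lo
        err = np.sum(counts * (deq - centers) ** 2) / counts.sum()
        if err < best_err:
            best_t, best_err = t, err
    return best_t

# Example: a dense bulk in [0, 1] plus two large outliers.
sample = np.concatenate([np.linspace(0.0, 1.0, 10000), [10.0, 10.0]])
print(uh_truncation_max(sample, bits=4))
```

With few bits, the selected value falls well below the raw maximum: clipping the rare outliers costs less error than spreading the quantization levels over the full range.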
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · CCD and CMOS Imaging Sensors
