A Comprehensive Survey on Model Quantization for Deep Neural Networks in Image Classification
Babak Rokh, Ali Azarpeyvand, Alireza Khanteymoori

TL;DR
This survey comprehensively reviews various quantization techniques for deep neural networks in image classification, analyzing methods, training strategies, and benchmarks to facilitate efficient deployment on resource-constrained devices.
Contribution
It provides an extensive analysis and comparison of quantization approaches, including clustering, scale factors, training methods, and evaluation metrics, filling a gap in consolidated knowledge.
Findings
Quantization reduces memory and computation costs significantly.
State-of-the-art methods achieve high accuracy on CIFAR-10 and ImageNet.
Layer sensitivity analysis guides effective quantization strategies.
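To make the first finding concrete, here is a minimal sketch of uniform quantization with a single scale factor per tensor. It assumes a symmetric signed 8-bit scheme; the function names and the example weights are hypothetical, chosen for illustration, and are not taken from the survey.

```python
from typing import List, Tuple


def quantize_symmetric(values: List[float], num_bits: int = 8) -> Tuple[List[int], float]:
    """Map floats to signed integer codes using one scale factor (assumed scheme)."""
    qmax = 2 ** (num_bits - 1) - 1                   # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0   # avoid dividing by zero
    # Round to the nearest integer step, then clamp to the representable range.
    codes = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes: List[int], scale: float) -> List[float]:
    """Recover approximate float values from integer codes."""
    return [q * scale for q in codes]


weights = [-1.0, -0.25, 0.0, 0.5, 1.0]               # made-up example values
codes, scale = quantize_symmetric(weights, num_bits=8)
restored = dequantize(codes, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, max_error)
```

In this sketch each weight is stored in 8 bits instead of 32, a 4x reduction in storage, and the reconstruction error of any value stays within about half of one scale step.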
Abstract
Recent advancements in machine learning achieved by Deep Neural Networks (DNNs) have been significant. While demonstrating high accuracy, DNNs are associated with a huge number of parameters and computations, which leads to high memory usage and energy consumption. As a result, deploying DNNs on devices with constrained hardware resources poses significant challenges. To overcome this, various compression techniques have been widely employed to optimize DNN accelerators. A promising approach is quantization, in which the full-precision values are stored in low bit-width precision. Quantization not only reduces memory requirements but also replaces high-cost operations with low-cost ones. DNN quantization offers flexibility and efficiency in hardware design, making it a widely adopted technique in various methods. Since quantization has been extensively utilized in previous works, there…
Taxonomy
Topics: Advanced Neural Network Applications · CCD and CMOS Imaging Sensors · Adversarial Robustness in Machine Learning
