A Closer Look at Hardware-Friendly Weight Quantization
Sungmin Bae, Piotr Zielinski, Satrajit Chatterjee

TL;DR
This paper compares traditional MSQE-based and gradient-based hardware-friendly weight quantization methods for DNNs, analyzing their performance differences and proposing techniques to enhance their stability and accuracy on MobileNet models.
Contribution
It provides a detailed analysis of two main quantization approaches, identifies key sources of performance issues, and introduces improvements that boost accuracy on ImageNet.
Findings
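The proposed techniques improve the accuracy of gradient-based methods by 4.0% and 3.3% on MobileNetV1 and MobileNetV2, respectively.
The optimization instability present in MSQE-based methods is fixed.
The performance differences between the two classes are traced to sensitivity to outliers and convergence instability of the quantizer scaling factor.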
Abstract
Quantizing a Deep Neural Network (DNN) model to be used on a custom accelerator with efficient fixed-point hardware implementations requires satisfying many stringent hardware-friendly quantization constraints to train the model. We evaluate the two main classes of hardware-friendly quantization methods in the context of weight quantization: the traditional Mean Squared Quantization Error (MSQE)-based methods and the more recent gradient-based methods. We study the two methods on MobileNetV1 and MobileNetV2 using multiple empirical metrics to identify the sources of performance differences between the two classes, namely, sensitivity to outliers and convergence instability of the quantizer scaling factor. Using those insights, we propose various techniques to improve the performance of both quantization methods - they fix the optimization instability issues present in the MSQE-based…
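To make the MSQE-based idea concrete, here is a minimal NumPy sketch, not the authors' implementation. It assumes a symmetric, per-tensor uniform quantizer whose scaling factor is restricted to powers of two (one common reading of "hardware-friendly"; the abstract above does not spell out the exact constraints). The function names `quantize` and `msqe_pow2_scale`, the exponent search range, and the synthetic weights are all hypothetical choices made for illustration.

```python
import numpy as np

def quantize(w, scale, bits=8):
    """Symmetric uniform quantizer: round to the integer grid, clip, rescale."""
    qmax = 2 ** (bits - 1) - 1
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale

def msqe_pow2_scale(w, bits=8, exponents=range(-16, 9)):
    """Brute-force search for the power-of-two scale with the lowest
    mean squared quantization error (MSQE) on the tensor w."""
    best_scale, best_err = None, np.inf
    for e in exponents:
        scale = 2.0 ** e
        err = float(np.mean((w - quantize(w, scale, bits)) ** 2))
        if err < best_err:
            best_scale, best_err = scale, err
    return best_scale, best_err

# Synthetic "weights": a narrow Gaussian, standing in for one layer's tensor.
rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.05, size=1000)
scale, err = msqe_pow2_scale(weights, bits=8)

# The same tensor with a single large outlier appended.
weights_outlier = np.append(weights, 4.0)
scale_o, err_o = msqe_pow2_scale(weights_outlier, bits=8)

print("no outlier:   scale =", scale, " MSQE =", err)
print("with outlier: scale =", scale_o, " MSQE =", err_o)
```

In this toy setup the single outlier pushes the selected scale to a larger power of two, so the bulk of the weights is represented on a coarser grid and the overall MSQE rises. That is one simple way to picture the sensitivity to outliers mentioned in the abstract; a gradient-based method would instead learn the scale through the training loss.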
Taxonomy
Topics: Advanced Neural Network Applications · Image and Signal Denoising Methods · Domain Adaptation and Few-Shot Learning
Methods: Depthwise Convolution · Pointwise Convolution · Depthwise Separable Convolution · Average Pooling · Dense Connections · Softmax · Global Average Pooling · MobileNetV1 · Convolution
