On Calibration of Modern Quantized Efficient Neural Networks
Joey Kuang, Alexander Wong

TL;DR
This paper investigates how quantization affects the calibration of neural networks across different architectures and datasets, highlighting the correlation between lower precision and poorer calibration, especially with 4-bit activations.
Contribution
It provides empirical insights into calibration behavior of quantized networks and evaluates temperature scaling as a calibration improvement method.
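For readers unfamiliar with temperature scaling, the sketch below shows the general idea: a single scalar T divides the logits before the softmax and is chosen to minimize negative log-likelihood on held-out data. The grid search, the function names, and the toy logits are all assumptions made for illustration; the paper's own fitting procedure is not described in this summary.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax of one logit vector after dividing by a temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def mean_nll(logit_rows, labels, temperature):
    """Average negative log-likelihood of the true labels at a given temperature."""
    losses = []
    for logits, y in zip(logit_rows, labels):
        p = softmax(logits, temperature)[y]
        losses.append(-math.log(max(p, 1e-12)))  # clamp to avoid log(0)
    return sum(losses) / len(losses)

def fit_temperature(logit_rows, labels, lo=0.05, hi=10.0, steps=400):
    """Pick the temperature with the lowest held-out NLL on a log-spaced grid.

    A grid search is an illustrative simplification; gradient-based
    optimizers are also commonly used for this one-parameter fit.
    """
    best_t, best_loss = 1.0, mean_nll(logit_rows, labels, 1.0)
    for i in range(steps + 1):
        t = lo * (hi / lo) ** (i / steps)
        loss = mean_nll(logit_rows, labels, t)
        if loss < best_loss:
            best_t, best_loss = t, loss
    return best_t

# Made-up validation logits: large margins everywhere, but one prediction is wrong,
# which is the typical signature of an overconfident model.
val_logits = [[8.0, 0.0, 0.0], [0.0, 9.0, 1.0], [7.0, 0.0, 1.0], [0.0, 1.0, 8.0]]
val_labels = [0, 1, 1, 2]

T = fit_temperature(val_logits, val_labels)
print("fitted temperature:", round(T, 3))
```

Because dividing by a positive T does not change the ordering of the logits, the predicted class (and therefore accuracy) is unchanged; only the confidence values move.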
Findings
Calibration quality worsens with lower precision.
GhostNet-VGG is the most robust of the three architectures to the performance drop at lower precision.
Temperature scaling can reduce calibration error for quantized networks, with some caveats.
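The findings above refer to calibration error without naming a metric in this summary. A common choice is the expected calibration error (ECE), which bins predictions by confidence and averages the gap between accuracy and mean confidence in each bin. The sketch below assumes that metric; the bin count, function name, and toy inputs are illustrative and not taken from the paper.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Expected calibration error with equal-width confidence bins.

    confidences: top-class probability for each prediction, in [0, 1]
    correct:     1 if the prediction was right, 0 otherwise
    Bins here are half-open [lo, hi), with 1.0 folded into the last bin;
    some implementations use (lo, hi] instead.
    """
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, hit))

    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(h for _, h in members) / len(members)
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece

# Made-up example: four predictions at 90% confidence, three of them correct.
ece = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 1, 0])
print("ECE:", ece)
```

In this toy case all predictions fall into one bin, so the ECE is simply the gap between 75% accuracy and 90% mean confidence.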
Abstract
We explore calibration properties at various precisions for three architectures (ShuffleNetv2, GhostNet-VGG, and MobileOne) and two datasets (CIFAR-100 and PathMNIST). The quality of calibration is observed to track the quantization quality: it is well documented that performance worsens with lower precision, and we observe a similar correlation with poorer calibration. This becomes especially severe in the 4-bit activation regime. GhostNet-VGG is shown to be the most robust to the overall performance drop at lower precision. We find that temperature scaling can improve calibration error for quantized networks, with some caveats. We hope that these preliminary insights can lead to more opportunities for explainable and reliable EdgeML.
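To give a sense of why the 4-bit activation regime is harsh, the sketch below simulates uniform quantization of a set of activation values at 4 and 8 bits and compares the rounding error. The min/max uniform scheme, the function name, and the toy activations are assumptions for illustration only; this summary does not state which quantization method the paper uses.

```python
def fake_quantize(values, num_bits):
    """Round values onto a uniform grid of 2**num_bits levels spanning [min, max].

    This is a generic min/max uniform quantizer, assumed here for illustration.
    """
    levels = 2 ** num_bits - 1          # number of steps between min and max
    lo, hi = min(values), max(values)
    if hi == lo:
        return list(values)             # constant input: nothing to quantize
    scale = (hi - lo) / levels          # width of one quantization step
    return [lo + round((v - lo) / scale) * scale for v in values]

def max_abs_error(original, quantized):
    """Largest absolute difference between original and quantized values."""
    return max(abs(a - b) for a, b in zip(original, quantized))

# Made-up activations spread evenly over [0, 1].
acts = [i / 100 for i in range(101)]

q4 = fake_quantize(acts, 4)   # at most 16 distinct values
q8 = fake_quantize(acts, 8)   # at most 256 distinct values

print("4-bit max error:", max_abs_error(acts, q4))
print("8-bit max error:", max_abs_error(acts, q8))
```

With only 16 representable levels, the 4-bit grid is 17 times coarser than the 8-bit one over the same range, so small differences between activations are far more likely to be rounded away.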
Taxonomy
Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications
