Fighting Quantization Bias With Bias

Alexander Finkelstein; Uri Almog; Mark Grobman

arXiv:1906.03193 · cs.LG · June 10, 2019 · 20 cites

Open Access

TL;DR

This paper traces the accuracy loss of low-precision neural networks to a shift in mean activation values, caused by an inherent bias in the quantization process, and proposes simple, fast correction methods that restore performance without extensive retraining.

Contribution

The paper introduces bias-compensation methods for quantized neural networks that correct the activation-mean shift using little data and computation, which makes low-precision deployment cheaper.

Findings

01 Bias correction restores network accuracy effectively.

02 Methods require only small unlabeled data sets.

03 Performance matches training-based quantization methods.

Abstract

Low-precision representation of deep neural networks (DNNs) is critical for efficient deployment of deep learning applications on embedded platforms; however, converting the network to low precision degrades its performance. Crucially, networks that are designed for embedded applications usually suffer from increased degradation since they have less redundancy. This is most evident for the ubiquitous MobileNet architecture, which requires a costly quantization-aware training cycle to achieve acceptable performance when quantized to 8 bits. In this paper, we trace the source of the degradation in MobileNets to a shift in the mean activation value. This shift is caused by an inherent bias in the quantization process which builds up across layers, shifting all network statistics away from the learned distribution. We show that this phenomenon happens in other architectures as well. We…
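The mean-shift effect described in the abstract can be illustrated on a single linear layer. The sketch below is not the authors' implementation (the abstract is truncated before the methods are described). It assumes a symmetric 4-bit weight quantizer, non-negative inputs standing in for post-ReLU activations, and a small unlabeled calibration set. All function names (`quantize`, `correct_bias`, and so on) are hypothetical. It measures the per-channel shift in mean output caused by quantization and folds the negated shift into the layer bias.

```python
import random

def quantize(w, bits, w_max):
    """Symmetric uniform quantization of one weight (assumed scheme)."""
    levels = 2 ** (bits - 1) - 1
    scale = w_max / levels
    q = max(-levels, min(levels, round(w / scale)))
    return q * scale

def linear(weights, bias, x):
    """y = W x + b for a single input vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def mean_outputs(weights, bias, inputs):
    """Per-channel mean of the layer output over a set of inputs."""
    totals = [0.0] * len(bias)
    for x in inputs:
        for i, y in enumerate(linear(weights, bias, x)):
            totals[i] += y
    return [t / len(inputs) for t in totals]

def correct_bias(w_float, w_quant, bias, calib_inputs):
    """Add (float mean - quantized mean) to each channel's bias."""
    mean_f = mean_outputs(w_float, bias, calib_inputs)
    mean_q = mean_outputs(w_quant, bias, calib_inputs)
    return [b + (mf - mq) for b, mf, mq in zip(bias, mean_f, mean_q)]

rng = random.Random(0)
n_in, n_out = 16, 4
w_float = [[rng.gauss(0, 1) for _ in range(n_in)] for _ in range(n_out)]
bias = [rng.gauss(0, 1) for _ in range(n_out)]
w_max = max(abs(w) for row in w_float for w in row)
w_quant = [[quantize(w, 4, w_max) for w in row] for row in w_float]

# Small unlabeled calibration set; inputs are non-negative (post-ReLU-like),
# so weight quantization errors do not average out to zero.
calib = [[rng.uniform(0, 1) for _ in range(n_in)] for _ in range(64)]

bias_fixed = correct_bias(w_float, w_quant, bias, calib)

mean_f = mean_outputs(w_float, bias, calib)
shift_before = [q - f for q, f in zip(mean_outputs(w_quant, bias, calib), mean_f)]
shift_after = [q - f for q, f in zip(mean_outputs(w_quant, bias_fixed, calib), mean_f)]

print("mean shift before correction:", shift_before)
print("mean shift after correction: ", shift_after)
```

On the calibration set the corrected layer reproduces the float layer's per-channel mean up to floating-point error. In a multi-layer network the correction would have to be applied layer by layer, since the abstract notes that the shift builds up across layers.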

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning