In-Hindsight Quantization Range Estimation for Quantized Training

Marios Fournarakis; Markus Nagel

arXiv:2105.04246 · cs.LG · May 11, 2021

TL;DR

This paper introduces in-hindsight range estimation for quantized training, offering a simple, fast, and hardware-friendly alternative to dynamic quantization that improves gradient and activation quantization during neural network training.

Contribution

The paper proposes a static range estimation method that quantizes the current iteration with ranges estimated on previous iterations, reducing the memory overhead and data traffic that dynamic quantization incurs in fully quantized training.

Findings

01. Effective across various architectures, including MobileNetV2

02. Achieves comparable or better accuracy than existing methods

03. Reduces memory and computational overhead during training

Abstract

Quantization techniques applied to the inference of deep neural networks have enabled fast and efficient execution on resource-constrained devices. The success of quantization during inference has motivated the academic community to explore fully quantized training, i.e., quantizing back-propagation as well. However, effective gradient quantization is still an open problem. Gradients are unbounded and their distribution changes significantly during training, which leads to the need for dynamic quantization. As we show, dynamic quantization can lead to significant memory overhead and additional data traffic, slowing down training. We propose a simple alternative to dynamic quantization, in-hindsight range estimation, that uses the quantization ranges estimated on previous iterations to quantize the present. Our approach enables fast static quantization of gradients and activations while…
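
The core idea described in the abstract, quantizing the current tensor with a range derived from earlier iterations, can be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the exponential-moving-average update, the `momentum` value, the first-iteration fallback, and the names `quantize` and `InHindsightRange` are all hypothetical choices made for this example.

```python
import numpy as np


def quantize(x, qmin, qmax, num_bits=8):
    """Uniform asymmetric fake-quantization of x onto a fixed range [qmin, qmax]."""
    levels = 2 ** num_bits - 1
    scale = max(qmax - qmin, 1e-8) / levels
    # Clip to the pre-computed range, then snap to the integer grid.
    q = np.round((np.clip(x, qmin, qmax) - qmin) / scale)
    return q * scale + qmin


class InHindsightRange:
    """Hypothetical helper: quantize with a range estimated on past iterations."""

    def __init__(self, momentum=0.9):
        self.momentum = momentum  # assumed EMA momentum, not from the source
        self.qmin = None
        self.qmax = None

    def __call__(self, x, num_bits=8):
        cur_min, cur_max = float(x.min()), float(x.max())
        if self.qmin is None:
            # Assumption: with no history yet, start from the current statistics.
            self.qmin, self.qmax = cur_min, cur_max
        # Static step: the range used here was fixed before this tensor was seen.
        out = quantize(x, self.qmin, self.qmax, num_bits)
        # In hindsight: fold the current min/max into the range for the next iteration.
        m = self.momentum
        self.qmin = m * self.qmin + (1 - m) * cur_min
        self.qmax = m * self.qmax + (1 - m) * cur_max
        return out
```

In a training loop, one such estimator would be kept per quantized tensor (for example, per layer gradient) and called once per iteration. Since the range is already known when the tensor is produced, the sketch never needs a second pass over the full-precision values to find their range, which is the kind of overhead the abstract attributes to dynamic quantization.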

Taxonomy

Methods: Depthwise Convolution · Pointwise Convolution · Batch Normalization · Depthwise Separable Convolution · Inverted Residual Block · Convolution · 1x1 Convolution · Average Pooling