A Generalized Zero-Shot Quantization of Deep Convolutional Neural Networks via Learned Weights Statistics

Prasen Kumar Sharma; Arun Abraham; Vikram Nelvoy Rajendiran

arXiv:2112.02834·cs.CV·December 14, 2021

TL;DR

This paper introduces a zero-shot quantization method for deep CNNs that uses the distribution of the pretrained weights, together with data distillation, to calibrate activation ranges. It requires neither the original data nor BN statistics, and it reports significantly higher accuracy than existing zero-shot methods.

Contribution

It proposes the first generalized zero-shot quantization framework using pretrained weights for activation range calibration, applicable to networks without BN layers.

Findings

1. Outperforms existing zero-shot methods by ~33% in accuracy on MobileNetV2.

2. Effective across multiple models and open-source frameworks.

3. First to address post-training zero-shot quantization for unnormalized networks.

Abstract

Quantizing the floating-point weights and activations of deep convolutional neural networks to a fixed-point representation reduces memory footprint and inference time. Recently, efforts have been made towards zero-shot quantization, which does not require the original unlabelled training samples of a given task. The best published works rely heavily on the learned batch normalization (BN) parameters to infer the range of the activations for quantization. In particular, these methods are built upon either an empirical estimation framework or a data distillation approach for computing the range of the activations. However, the performance of such schemes degrades severely when they are presented with a network that does not contain BN layers. In this line of thought, we propose a generalized zero-shot quantization (GZSQ) framework that neither requires original data nor relies on BN layer…
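
To make the role of the activation range concrete, the following is a minimal sketch of standard uniform affine quantization to 8-bit integers, assuming the range `[x_min, x_max]` has already been obtained somehow. It is not the GZSQ procedure: how the range is calibrated without data or BN statistics is the paper's contribution and is not reproduced here. The function names `quantize_uniform` and `dequantize` are hypothetical and chosen for illustration only.

```python
import numpy as np


def quantize_uniform(x, x_min, x_max, num_bits=8):
    """Map float values to unsigned integers given a calibrated range.

    Illustrative only: the range is passed in by the caller and is not
    estimated here.
    """
    qmin, qmax = 0, 2 ** num_bits - 1
    # Widen the range so that 0.0 is always inside it.
    x_min = min(float(x_min), 0.0)
    x_max = max(float(x_max), 0.0)
    scale = (x_max - x_min) / (qmax - qmin)
    if scale == 0.0:
        scale = 1.0  # degenerate range (all zeros); avoid division by zero
    # Integer code that represents the real value 0.0.
    zero_point = int(round(qmin - x_min / scale))
    zero_point = max(qmin, min(qmax, zero_point))
    # Round to the nearest level and clip values that fall outside the range.
    q = np.clip(np.round(np.asarray(x) / scale) + zero_point, qmin, qmax)
    return q.astype(np.int32), scale, zero_point


def dequantize(q, scale, zero_point):
    """Recover approximate float values from integer codes."""
    return scale * (q.astype(np.float32) - zero_point)


if __name__ == "__main__":
    activations = np.linspace(-1.0, 3.0, 9)
    q, scale, zp = quantize_uniform(activations, activations.min(), activations.max())
    print(q, scale, zp)
    print(dequantize(q, scale, zp))
```

The sketch shows why range estimation matters: a range that is too wide wastes integer levels and increases rounding error, while a range that is too narrow clips activations.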

Taxonomy

Methods: Depthwise Convolution · Pointwise Convolution · Depthwise Separable Convolution · 1x1 Convolution · Inverted Residual Block · Convolution · Average Pooling · Batch Normalization