SPIQ: Data-Free Per-Channel Static Input Quantization

Edouard Yvinec; Arnaud Dapogny; Matthieu Cord; Kevin Bailly

arXiv:2203.14642 · cs.CV · March 29, 2022 · 1 citation

Open Access

TL;DR

SPIQ is a data-free, per-channel static input quantization method. It reaches accuracy comparable to dynamic quantization methods while keeping the inference speed of static quantization.

Contribution

The paper presents SPIQ, a novel static input quantization scheme that rivals dynamic methods in accuracy without incurring additional inference costs.

Findings

1. Achieves accuracy comparable to dynamic quantization methods.
2. Significantly outperforms existing static quantization techniques.
3. Effective across multiple computer vision benchmarks.

Abstract

Computationally expensive neural networks are ubiquitous in computer vision, and solutions for efficient inference have drawn growing attention in the machine learning community. Examples of such solutions include quantization, i.e. converting the processed values (weights and inputs) from floating point into integers, e.g. int8 or int4. Concurrently, the rise of privacy concerns has motivated the study of less invasive acceleration methods, such as data-free quantization of the weights and activations of pre-trained models. Previous approaches either exploit statistical information to deduce scalar ranges and scaling factors for the activations in a static manner, or dynamically adapt this range on-the-fly for each input of each layer (also referred to as activations), the latter generally being more accurate at the expense of significantly slower inference. In this work, we argue that static…
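The static versus dynamic distinction drawn in the abstract can be illustrated with a minimal sketch. This is not the SPIQ method itself, only generic symmetric per-tensor int8 quantization. The helper names (`quantize`, `dequantize`, `static_scale`, `dynamic_scale`), the sample input, and the assumed range of 4.0 are all hypothetical and chosen for illustration.

```python
def qmax_for(bits):
    # Largest magnitude representable in a symmetric signed integer grid.
    return 2 ** (bits - 1) - 1

def quantize(values, scale, bits=8):
    # Symmetric uniform quantization: round(x / scale), clipped to [-qmax, qmax].
    qmax = qmax_for(bits)
    return [max(-qmax, min(qmax, round(v / scale))) for v in values]

def dequantize(q_values, scale):
    # Map integers back to approximate floating point values.
    return [q * scale for q in q_values]

def static_scale(assumed_max_abs, bits=8):
    # Static: the range is fixed before inference (assumption: it was
    # estimated offline, e.g. from stored layer statistics).
    return assumed_max_abs / qmax_for(bits)

def dynamic_scale(values, bits=8):
    # Dynamic: the range is recomputed from every incoming input,
    # which costs an extra pass over the data at inference time.
    max_abs = max(abs(v) for v in values)
    return max_abs / qmax_for(bits) if max_abs > 0 else 1.0

def max_error(values, scale, bits=8):
    # Worst-case reconstruction error after a quantize/dequantize round trip.
    restored = dequantize(quantize(values, scale, bits), scale)
    return max(abs(a - b) for a, b in zip(values, restored))

x = [0.02, -0.5, 0.31, 0.9]               # hypothetical layer input
static_err = max_error(x, static_scale(4.0))   # 4.0 is an assumed, loose range
dynamic_err = max_error(x, dynamic_scale(x))

print("static max error: ", static_err)
print("dynamic max error:", dynamic_err)
```

On this input the dynamic scale fits the actual range of `x` more tightly than the loose static range, so its rounding error is smaller; the price is computing `max(abs(v))` for every input, which is the inference overhead the abstract refers to.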

Taxonomy

Topics: Advanced Neural Network Applications · Machine Learning and Data Classification · Neural Networks and Applications