Designing strong baselines for ternary neural network quantization through support and mass equalization

Edouard Yvinec; Arnaud Dapogny; Kevin Bailly

arXiv:2306.17442·cs.CV·July 3, 2023

TL;DR

This paper introduces two operators for ternary neural network quantization, TQuant and MQuant, which respectively target the highest and the average quantization error, and reports significant performance gains across data-free, post-training, and quantization-aware training scenarios.

Contribution

The paper proposes two new ternary quantization operators, each built around a different error-minimization objective (highest versus average quantization error), advancing the state of the art in neural network quantization.

Findings

01. Significant performance improvements in ternary quantization.

02. Effective across data-free, post-training, and quantization-aware training scenarios.

03. Provides insights for future neural network quantization research.

Abstract

Deep neural networks (DNNs) offer the highest performance in a wide range of applications in computer vision. These results rely on over-parameterized backbones, which are expensive to run. This computational burden can be dramatically reduced by quantizing (in either data-free (DFQ), post-training (PTQ) or quantization-aware training (QAT) scenarios) floating point values to ternary values (2 bits, with each weight taking value in {-1,0,1}). In this context, we observe that rounding to nearest minimizes the expected error given a uniform distribution and thus does not account for the skewness and kurtosis of the weight distribution, which strongly affects ternary quantization performance. This raises the following question: shall one minimize the highest or average quantization error? To answer this, we design two operators: TQuant and MQuant that correspond to these respective…
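To make the abstract's question concrete, here is a minimal sketch of plain round-to-nearest ternary quantization, together with the two error measures the abstract contrasts (highest versus average quantization error). It is not the paper's TQuant or MQuant operator. The function names, the toy weights, and the choice of the maximum absolute weight as the scale are all assumptions made for illustration.

```python
def ternary_round_to_nearest(weights, scale):
    """Map each weight to a code q in {-1, 0, 1} by rounding w / scale.

    Hypothetical helper for illustration; the dequantized weight is scale * q.
    Python's round() resolves exact .5 ties to the nearest even integer.
    """
    codes = []
    for w in weights:
        q = round(w / scale)        # nearest integer
        q = max(-1, min(1, q))      # clip to the ternary support
        codes.append(q)
    return codes


def quantization_errors(weights, codes, scale):
    """Return (highest, average) absolute error between w and scale * q."""
    errs = [abs(w - scale * q) for w, q in zip(weights, codes)]
    return max(errs), sum(errs) / len(errs)


# Toy weights (made up); the scale is the largest magnitude, an assumption.
weights = [-0.9, -0.2, 0.05, 0.4, 1.0]
scale = max(abs(w) for w in weights)

codes = ternary_round_to_nearest(weights, scale)   # [-1, 0, 0, 0, 1]
max_err, mean_err = quantization_errors(weights, codes, scale)
print(codes, max_err, mean_err)
```

With these toy values most weights collapse to 0, and the two error measures differ noticeably. That gap is why the choice between minimizing the highest and the average error matters when the weight distribution is far from uniform.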

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning