Class-based Quantization for Neural Networks

Wenhao Sun; Grace Li Zhang; Huaxi Gu; Bing Li; Ulf Schlichtmann

arXiv:2211.14928 · cs.LG · November 29, 2022

TL;DR

This paper introduces a class-based quantization approach for neural networks that assigns different bit-widths to filters or neurons based on their importance, effectively reducing model size and computation while maintaining accuracy.

Contribution

The proposed method scores the importance of each filter or neuron with respect to the classes in the dataset, then uses a search algorithm to find the minimum number of quantization bits for each filter or neuron individually.
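This summary does not describe the paper's scoring function or its search algorithm, so the following is only a hypothetical illustration of the general idea: more important filters receive more bits. The function name `assign_bits`, the linear score-to-bits rule, and the `min_bits`/`max_bits` bounds are all assumptions made for this sketch and are not taken from the paper.

```python
def assign_bits(scores, min_bits=2, max_bits=8):
    """Map per-filter importance scores to bit-widths.

    Hypothetical rule (NOT the paper's search algorithm): scale each score
    linearly between the lowest and highest score, so the least important
    filter gets `min_bits` and the most important gets `max_bits`.
    """
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All filters look equally important: keep full precision for all.
        return [max_bits] * len(scores)
    bits = []
    for s in scores:
        frac = (s - lo) / (hi - lo)  # 0.0 for least, 1.0 for most important
        bits.append(min_bits + round(frac * (max_bits - min_bits)))
    return bits


# Made-up scores for three filters, for illustration only.
print(assign_bits([0.0, 5.0, 10.0]))  # [2, 5, 8]
```

A real method would also have to check accuracy after each assignment, which is presumably the role of the search algorithm mentioned above.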

Findings

01

Maintains inference accuracy with low-bit quantization.

02

Achieves higher inference accuracy than existing methods at the same quantization levels.

Abstract

In deep neural networks (DNNs), there are a huge number of weights and multiply-and-accumulate (MAC) operations. Accordingly, it is challenging to apply DNNs on resource-constrained platforms, e.g., mobile phones. Quantization is a method to reduce the size and the computational complexity of DNNs. Existing quantization methods either require hardware overhead to achieve a non-uniform quantization or focus on model-wise and layer-wise uniform quantization, which are not as fine-grained as filter-wise quantization. In this paper, we propose a class-based quantization method to determine the minimum number of quantization bits for each filter or neuron in DNNs individually. In the proposed method, the importance score of each filter or neuron with respect to the number of classes in the dataset is first evaluated. The larger the score is, the more important the filter or neuron is and…
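To make the difference between layer-wise and filter-wise quantization concrete, here is a minimal sketch of uniform quantization in which every filter has its own bit-width. It assumes symmetric uniform quantization with a per-filter scale, which is a common scheme but is not confirmed by this summary as the paper's exact quantizer. The function names and the example bit-widths are illustrative assumptions.

```python
def quantize_filter(weights, bits):
    """Symmetric uniform quantization of one filter's weights to `bits` bits.

    Assumed scheme (not confirmed as the paper's): signed integer levels in
    [-qmax, qmax], with one scale per filter derived from the largest
    absolute weight. Returns the de-quantized (float) values.
    """
    if bits < 2:
        raise ValueError("need at least 2 bits for a signed symmetric grid")
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0:
        return list(weights)  # all-zero filter: nothing to quantize
    scale = max_abs / qmax
    # Round to the nearest level, clamp to the representable range, rescale.
    return [max(-qmax, min(qmax, round(w / scale))) * scale for w in weights]


def quantize_layer(filters, bits_per_filter):
    """Quantize each filter of a layer with its own bit-width."""
    return [quantize_filter(f, b) for f, b in zip(filters, bits_per_filter)]


def weight_storage_bits(filters, bits_per_filter):
    """Total bits needed to store the quantized weights (scales excluded)."""
    return sum(len(f) * b for f, b in zip(filters, bits_per_filter))


# Two toy filters with four weights each; the bit-widths are made up.
filters = [[0.5, -1.0, 0.25, 0.0], [0.1, 0.2, -0.3, 0.4]]
bits_per_filter = [8, 2]
quantized = quantize_layer(filters, bits_per_filter)
print(weight_storage_bits(filters, bits_per_filter))  # 40
```

With a single layer-wide bit-width of 8, the same eight weights would need 64 bits, so lowering the bit-width of the second filter alone saves storage without touching the first.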

Taxonomy

Topics: Advanced Neural Network Applications · Neural Networks and Applications · Machine Learning and ELM