Additive Powers-of-Two Quantization: An Efficient Non-uniform Discretization for Neural Networks
Yuhang Li, Xin Dong, Wei Wang

TL;DR
The paper introduces APoT quantization, a non-uniform scheme for neural network weights and activations that improves efficiency and accuracy, outperforming state-of-the-art methods and nearing full-precision performance.
Contribution
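It proposes APoT quantization, in which every quantization level is a sum of powers-of-two terms, together with a reparameterized clipping function that gives a better-defined gradient for learning the clipping threshold, and weight normalization for more stable training.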
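The two training aids named above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names `normalize_weights` and `clip_reparam`, the `eps` constant, and the exact form of both operations (standardizing weights to zero mean and unit variance, and writing the clip as alpha * clip(w / alpha, -1, 1)) are assumptions, since the summary here does not give the formulas.

```python
from statistics import fmean, pstdev

def normalize_weights(weights, eps=1e-5):
    """Standardize a list of weights to roughly zero mean and unit variance.

    Assumption: this is what "weight normalization" refers to; eps is a
    hypothetical constant that guards against division by zero.
    """
    mu = fmean(weights)
    sigma = pstdev(weights)
    return [(w - mu) / (sigma + eps) for w in weights]

def clip_reparam(w, alpha):
    """Clip w to [-alpha, alpha], written as alpha * clip(w / alpha, -1, 1).

    Assumption: factoring alpha out of the clip is one way to reparameterize
    the clipping function so that alpha scales every output value.
    """
    return alpha * max(-1.0, min(1.0, w / alpha))

normalized = normalize_weights([0.1, -0.2, 0.4, 0.05])
clipped = [clip_reparam(w, alpha=1.0) for w in normalized]
```

In this form the threshold alpha multiplies every clipped value rather than only acting as a boundary, which is one plausible reading of "better-defined gradient for learning the clipping threshold".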
Findings
4-bit ResNet-50 achieves 76.6% top-1 accuracy on ImageNet
Reduces computational cost by 22% compared with uniform quantization
Outperforms existing quantization methods
Abstract
We propose Additive Powers-of-Two (APoT) quantization, an efficient non-uniform quantization scheme for the bell-shaped and long-tailed distribution of weights and activations in neural networks. By constraining all quantization levels as the sum of Powers-of-Two terms, APoT quantization enjoys high computational efficiency and a good match with the distribution of weights. A simple reparameterization of the clipping function is applied to generate a better-defined gradient for learning the clipping threshold. Moreover, weight normalization is presented to refine the distribution of weights to make the training more stable and consistent. Experimental results show that our proposed method outperforms state-of-the-art methods, and is even competitive with the full-precision models, demonstrating the effectiveness of our proposed APoT quantization. For example, our 4-bit quantized…
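The idea of building every level from a sum of powers-of-two terms can be made concrete with a short sketch. The abstract does not spell out the exact level set, so the construction below is an assumption: a b-bit quantizer is split into n = b / k additive terms, each term is either zero or one of 2^k - 1 powers of two, and the result is rescaled so the largest level equals the clipping threshold alpha. The names `apot_levels` and `quantize` and the parameters `bits`, `k`, and `alpha` are hypothetical, and only the unsigned case is shown.

```python
from itertools import product

def apot_levels(bits=4, k=2, alpha=1.0):
    """Return sorted non-negative levels, each a scaled sum of powers of two."""
    if bits % k != 0:
        raise ValueError("this sketch assumes bits is divisible by k")
    n = bits // k  # number of additive terms per level
    # Term i is either 0 or 2^-(i + j*n) for j = 0 .. 2^k - 2 (2^k choices).
    choices = [
        [0.0] + [2.0 ** -(i + j * n) for j in range(2 ** k - 1)]
        for i in range(n)
    ]
    # Every level is one choice per term, added together.
    sums = sorted({sum(combo) for combo in product(*choices)})
    top = sums[-1]
    # Rescale so the largest level equals the clipping threshold alpha.
    return [alpha * s / top for s in sums]

def quantize(x, levels):
    """Snap a scalar to the nearest level (values outside the range saturate)."""
    return min(levels, key=lambda q: abs(q - x))

levels = apot_levels(bits=4, k=2, alpha=1.0)
print(len(levels))             # number of distinct levels
print(quantize(0.3, levels))   # nearest level to 0.3
```

The levels produced this way are denser near zero than near alpha, which is the property that lets a non-uniform scheme follow a bell-shaped distribution, while each level remains a short sum of shift-friendly terms.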
Taxonomy
Topics: Neural Networks and Applications · Medical Image Segmentation Techniques · Advanced Neural Network Applications
Methods: Weight Normalization
