Robustness-aware 2-bit quantization with real-time performance for neural network
Xiaobin Li, Hongxu Jiang, Shuangxi Huang, Fangzheng Tian

TL;DR
This paper introduces a robustness-aware 2-bit neural network quantization method that enhances accuracy and robustness using structural and spectral norm-based loss functions, enabling real-time performance on datasets like CIFAR-10 and ImageNet.
Contribution
It proposes a novel 2-bit quantization scheme based on binary neural networks and GANs, incorporating structural and robustness-aware loss functions for improved performance and adversarial robustness.
Findings
Outperforms state-of-the-art 2-bit quantization methods on CIFAR-10 and ImageNet.
Achieves robustness against FGSM adversarial attacks.
Enables real-time neural network inference with reduced precision.
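For readers unfamiliar with the attack named above, FGSM (fast gradient sign method) perturbs an input by a small step `eps` in the direction of the sign of the loss gradient with respect to that input. The sketch below is only an illustration on a tiny logistic-regression model; the model, the weights, and the names `fgsm` and `predict` are assumptions made for this example and are not taken from the paper.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sign(v):
    # Returns -1, 0, or +1.
    return (v > 0) - (v < 0)


def predict(x, w, b):
    # Probability of class 1 under a logistic-regression model (illustrative only).
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def fgsm(x, y, w, b, eps):
    # For binary cross-entropy, the gradient of the loss w.r.t. the input is (p - y) * w.
    p = predict(x, w, b)
    grad = [(p - y) * wi for wi in w]
    # FGSM step: move each input coordinate by eps in the direction that increases the loss.
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]


# Hypothetical toy values, chosen only to exercise the function.
w, b = [1.0, -2.0], 0.0
x, y = [0.5, 0.1], 1
x_adv = fgsm(x, y, w, b, eps=0.1)
```

With these toy values the perturbed input lowers the model's confidence in the true class, which is the effect a robustness-aware training objective tries to limit.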
Abstract
A quantized neural network (NN) with reduced bit precision is an effective way to reduce computational and memory resource requirements and plays a vital role in machine learning. However, it remains challenging to avoid significant accuracy degradation caused by numerical approximation and lower redundancy. In this paper, a novel robustness-aware 2-bit quantization scheme for NNs is proposed, based on binary NNs and generative adversarial networks (GANs), which improves performance by enriching the information of the binary NN, efficiently extracting structural information, and considering the robustness of the quantized NN. Specifically, shift-addition operations replace multiply-accumulate operations in the quantization process, which can effectively speed up the NN. Meanwhile, a structural loss between the original NN and the quantized NN is proposed such that the structural…
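The idea of replacing multiply-accumulate with shift-addition can be sketched as follows. This is a minimal illustration, not the paper's scheme: it assumes a hypothetical 2-bit codebook in which each weight is stored as one sign bit and one shift bit, so the weight takes a value in {-2, -1, +1, +2} times a scale, and multiplying an integer activation by it reduces to a bit shift plus an add or subtract. The function names `quantize_pow2` and `shift_add_dot` are invented for this example.

```python
def quantize_pow2(w, scale):
    # Map a real weight to (sign, shift), representing sign * 2**shift * scale.
    # Assumed codebook {-2, -1, +1, +2} * scale; not taken from the paper.
    sign = -1 if w < 0 else 1
    magnitude = abs(w) / scale
    shift = 1 if magnitude >= 1.5 else 0  # nearest of the magnitudes {1, 2}
    return sign, shift


def shift_add_dot(x_int, codes):
    # Dot product of integer activations with power-of-two weights,
    # using only shifts and additions (no multiplications).
    acc = 0
    for x, (sign, shift) in zip(x_int, codes):
        term = x << shift
        acc += term if sign > 0 else -term
    return acc


# Hypothetical toy values, chosen only to exercise the functions.
weights = [0.9, -2.1, 1.6, -0.4]
codes = [quantize_pow2(w, scale=1.0) for w in weights]
activations = [3, 5, -2, 7]
result = shift_add_dot(activations, codes)

# Reference computation with ordinary multiplication on the dequantized weights.
reference = sum(x * s * (2 ** k) for x, (s, k) in zip(activations, codes))
```

The result of the shift-add path matches the reference computed with ordinary multiplication, which is why hardware can swap the multiplier for a cheaper shifter when weights are restricted to powers of two.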
Taxonomy
Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning
Methods: Average Pooling · Global Average Pooling · 1x1 Convolution · Dropout · Residual Connection · Fire Module · Max Pooling · Softmax · Xavier Initialization
