FxP-QNet: A Post-Training Quantizer for the Design of Mixed Low-Precision DNNs with Dynamic Fixed-Point Representation
Ahmad Shawahna, Sadiq M. Sait, Aiman El-Maleh, and Irfan Ahmad

TL;DR
FxP-QNet is a post-training quantization framework that designs mixed low-precision DNNs with dynamic fixed-point representation, optimizing accuracy and compression without retraining.
Contribution
It introduces a novel adaptive quantization method that balances accuracy and low-precision requirements through post-training self-distillation and error statistics.
Findings
Achieves significant memory reduction with minimal accuracy loss.
Applies effectively to AlexNet, VGG-16, and ResNet-18 on ImageNet.
No retraining needed for quantization.
Abstract
Deep neural networks (DNNs) have demonstrated their effectiveness in a wide range of computer vision tasks, with state-of-the-art results obtained through complex and deep structures that require intensive computation and memory. Nowadays, efficient model inference is crucial for consumer applications on resource-constrained platforms. As a result, there is much interest in the research and development of dedicated deep learning (DL) hardware to improve the throughput and energy efficiency of DNNs. Low-precision representation of DNN data structures through quantization would bring great benefits to specialized DL hardware. However, aggressive quantization leads to a severe accuracy drop. As such, quantization opens a large hyper-parameter space at bit-precision levels, the exploration of which is a major challenge. In this paper, we propose a novel framework referred to as the…
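To make the term "dynamic fixed-point representation" concrete, here is a minimal Python sketch of the generic idea: a group of values shares one word length, and the fractional length is chosen from the group's largest magnitude. This is not the FxP-QNet algorithm itself. The function name `dynamic_fixed_point_quantize` and the max-magnitude rule for picking the fractional length are assumptions made for illustration only.

```python
import math


def dynamic_fixed_point_quantize(values, word_length):
    """Quantize floats to signed fixed-point codes sharing one fractional length.

    Hypothetical helper for illustration; not the method proposed in the paper.
    """
    max_abs = max(abs(v) for v in values)
    # Integer bits (sign bit excluded) needed to cover the largest magnitude.
    int_bits = math.floor(math.log2(max_abs)) + 1 if max_abs > 0 else 0
    # Remaining bits go to the fraction; this may be negative for large values.
    frac_bits = word_length - 1 - int_bits
    scale = 2.0 ** frac_bits
    q_min = -(2 ** (word_length - 1))
    q_max = 2 ** (word_length - 1) - 1
    # Round to the nearest code (Python rounds ties to even), then saturate.
    codes = [min(max(round(v * scale), q_min), q_max) for v in values]
    dequantized = [c / scale for c in codes]
    return codes, frac_bits, dequantized


codes, frac_bits, approx = dynamic_fixed_point_quantize([0.5, -0.25, 0.1], 8)
print(codes, frac_bits, approx)
```

In a mixed-precision setting, a call like this would be made per layer or per tensor with a different `word_length` for each, which is where the large bit-precision search space mentioned in the abstract comes from.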
