BMPQ: Bit-Gradient Sensitivity Driven Mixed-Precision Quantization of DNNs from Scratch
Souvik Kundu, Shikai Wang, Qirui Sun, Peter A. Beerel, Massoud Pedram

TL;DR
BMPQ is a novel training method for mixed-precision quantization of deep neural networks that uses bit-gradient sensitivity analysis to efficiently produce highly compressed models without pre-training or extensive computation.
Contribution
The paper introduces BMPQ, a single-iteration, pre-training-free mixed-precision quantization method guided by bit-gradient analysis and ILP optimization, advancing model compression techniques.
Findings
BMPQ achieves 15.4x fewer parameter bits with negligible accuracy loss.
Models trained with BMPQ are 2-3x smaller than those produced by state-of-the-art during-training schemes.
Compared with those schemes, BMPQ also improves accuracy by up to 14.54% on benchmark datasets.
Abstract
Large DNNs with mixed-precision quantization can achieve ultra-high compression while retaining high classification performance. However, because of the challenges in finding an accurate metric that can guide the optimization process, these methods either sacrifice significant performance compared to the 32-bit floating-point (FP-32) baseline or rely on a compute-expensive, iterative training policy that requires the availability of a pre-trained baseline. To address this issue, this paper presents BMPQ, a training method that uses bit gradients to analyze layer sensitivities and yield mixed-precision quantized models. BMPQ requires a single training iteration but does not need a pre-trained baseline. It uses an integer linear program (ILP) to dynamically adjust the precision of layers during training, subject to a fixed hardware budget. To evaluate the efficacy of BMPQ, we conduct…
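The budget-constrained precision assignment mentioned in the abstract can be illustrated with a small sketch. This is not the authors' formulation: the objective (maximize the sum of sensitivity times bit-width), the sensitivity scores, the layer sizes, and the function name `assign_bit_widths` are all assumptions made for illustration, and exhaustive enumeration stands in for an ILP solver, which is only practical for a handful of layers.

```python
from itertools import product


def assign_bit_widths(sensitivity, num_params, choices, budget_bits):
    """Pick one bit-width per layer under a total parameter-bit budget.

    Assumed objective (not taken from the paper): maximize
    sum(sensitivity[l] * bits[l]) subject to
    sum(num_params[l] * bits[l]) <= budget_bits.
    Returns (assignment, score); assignment is None if nothing fits.
    """
    best, best_score = None, float("-inf")
    # Brute-force search over every combination of candidate bit-widths.
    for bits in product(choices, repeat=len(sensitivity)):
        cost = sum(n * b for n, b in zip(num_params, bits))
        if cost > budget_bits:
            continue  # violates the hardware budget
        score = sum(s * b for s, b in zip(sensitivity, bits))
        if score > best_score:
            best, best_score = bits, score
    return best, best_score


# Hypothetical per-layer sensitivity scores and parameter counts.
sensitivity = [0.9, 0.2, 0.5]
num_params = [1000, 4000, 2000]
choices = (2, 4, 8)    # candidate bit-widths per layer
budget_bits = 24000    # total parameter bits allowed

bits, score = assign_bit_widths(sensitivity, num_params, choices, budget_bits)
print(bits)  # (8, 2, 4)
```

In this toy instance the most sensitive layer receives 8 bits, while the large, least sensitive layer is pushed down to 2 bits so that the total stays within the budget. A training loop in the spirit of the abstract would recompute the sensitivities and re-solve this assignment periodically instead of solving it once.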
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Medical Imaging Techniques and Applications
