DBQ: A Differentiable Branch Quantizer for Lightweight Deep Neural Networks
Hassan Dbouk, Hetul Sanghvi, Mahesh Mehendale, Naresh Shanbhag

TL;DR
This paper introduces DBQ, a fully differentiable non-uniform quantizer designed for lightweight neural networks, enabling efficient deployment on resource-constrained devices with minimal accuracy loss.
Contribution
The paper presents a novel differentiable quantization method that effectively quantizes lightweight architectures like MobileNet, outperforming existing techniques in accuracy and efficiency.
Findings
DBQ achieves state-of-the-art accuracy on CIFAR-10, ImageNet, and Visual Wake Words datasets.
DBQ enables aggressive quantization of MobileNet and ShuffleNetV2 architectures.
The method offers a Pareto-optimal balance between accuracy and complexity.
Abstract
Deep neural networks have achieved state-of-the-art performance on various computer vision tasks. However, their deployment on resource-constrained devices has been hindered by their high computational and storage complexity. While various complexity reduction techniques, such as lightweight network architecture design and parameter quantization, have been successful in reducing the cost of implementing these networks, these methods have often been considered orthogonal. In reality, existing quantization techniques fail to replicate their success on lightweight architectures such as MobileNet. To this end, we present a novel fully differentiable non-uniform quantizer that can be seamlessly mapped onto efficient ternary-based dot product engines. We conduct comprehensive experiments on CIFAR-10, ImageNet, and Visual Wake Words datasets. The proposed quantizer (DBQ) successfully…
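The abstract does not spell out the quantizer's formulation, so the following is only an illustrative sketch, not the authors' method. It assumes a weight vector written as a sum of scaled ternary vectors (entries in {-1, 0, 1}) and shows two things: the representable weight levels are non-uniformly spaced, and a dot product with such weights reduces to one ternary dot product per branch plus a final scaling. All names (`ternary_dot`, `branch_dot`, `levels`) and the scale values are hypothetical, and the differentiable training procedure is not modeled here.

```python
from itertools import product


def ternary_dot(t, x):
    """Dot product with a ternary vector: only additions and subtractions."""
    acc = 0.0
    for ti, xi in zip(t, x):
        if ti == 1:
            acc += xi
        elif ti == -1:
            acc -= xi
    return acc


def branch_dot(branches, scales, x):
    """Dot product of x with w = sum_k scales[k] * branches[k]."""
    return sum(a * ternary_dot(t, x) for a, t in zip(scales, branches))


def levels(scales):
    """All weight values representable by the scaled ternary branches."""
    combos = product((-1, 0, 1), repeat=len(scales))
    return sorted({sum(a * t for a, t in zip(scales, c)) for c in combos})


# Assumed example values (not taken from the paper).
scales = [1.0, 0.25]
branches = [[1, 0, -1, 1], [0, 1, 1, -1]]
x = [0.5, -2.0, 4.0, 1.0]

# Reference: build the dense weights explicitly and take an ordinary dot product.
weights = [sum(a * t[i] for a, t in zip(scales, branches)) for i in range(len(x))]
dense = sum(w * xi for w, xi in zip(weights, x))

print(levels(scales))                      # unevenly spaced quantization levels
print(branch_dot(branches, scales, x), dense)
```

With two branches the sketch yields nine levels whose spacing is not constant, which is what "non-uniform" means here; each branch needs no multiplications inside the accumulation loop.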
Taxonomy
Methods: Global Average Pooling · Inverted Residual Block · Pointwise Convolution · Dense Connections · Depthwise Convolution · Batch Normalization · Depthwise Separable Convolution · Softmax · MobileNetV1
