DBQ: A Differentiable Branch Quantizer for Lightweight Deep Neural Networks

Hassan Dbouk; Hetul Sanghvi; Mahesh Mehendale; Naresh Shanbhag

arXiv:2007.09818·cs.CV·July 21, 2020

TL;DR

This paper introduces DBQ, a fully differentiable non-uniform quantizer designed for lightweight neural networks, enabling efficient deployment on resource-constrained devices with minimal accuracy loss.

Contribution

The paper presents a novel differentiable quantization method that effectively quantizes lightweight architectures like MobileNet, outperforming existing techniques in accuracy and efficiency.

Findings

1. DBQ achieves state-of-the-art accuracy on CIFAR-10, ImageNet, and Visual Wake Words datasets.

2. DBQ enables aggressive quantization of MobileNet and ShuffleNetV2 architectures.

3. The method offers a Pareto-optimal balance between accuracy and complexity.

Abstract

Deep neural networks have achieved state-of-the-art performance on various computer vision tasks. However, their deployment on resource-constrained devices has been hindered by their high computational and storage complexity. While various complexity reduction techniques, such as lightweight network architecture design and parameter quantization, have been successful in reducing the cost of implementing these networks, these methods have often been considered orthogonal. In reality, existing quantization techniques fail to replicate their success on lightweight architectures such as MobileNet. To this end, we present a novel fully differentiable non-uniform quantizer that can be seamlessly mapped onto efficient ternary-based dot product engines. We conduct comprehensive experiments on CIFAR-10, ImageNet, and Visual Wake Words datasets. The proposed quantizer (DBQ) successfully…
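
The abstract's claim that a non-uniform quantizer can map onto ternary-based dot product engines can be illustrated with a small sketch. It approximates a weight vector as a sum of scaled ternary branches and shows that the dot product with the quantized weights then splits into ternary dot products, which need only additions and subtractions. Everything here is an assumption made for illustration: the greedy residual fitting, the 0.7 threshold factor, and the function names are hypothetical, and this is not the differentiable training procedure proposed in the paper.

```python
# Illustrative sketch only: greedy residual ternarization in pure Python.
# This is NOT the paper's differentiable quantizer; names and the 0.7
# threshold factor are hypothetical choices made for this example.

def ternarize(values, threshold):
    """Map each value to -1, 0, or +1 using a symmetric threshold."""
    return [0 if abs(v) <= threshold else (1 if v > 0 else -1) for v in values]

def fit_branch(residual):
    """Choose a scale and ternary codes for one branch (heuristic assumption)."""
    mean_abs = sum(abs(v) for v in residual) / len(residual)
    codes = ternarize(residual, 0.7 * mean_abs)
    kept = [abs(v) for v, c in zip(residual, codes) if c != 0]
    scale = sum(kept) / len(kept) if kept else 0.0
    return scale, codes

def branch_quantize(weights, num_branches=2):
    """Approximate weights as sum_i scale_i * codes_i with ternary codes."""
    residual = list(weights)
    branches = []
    for _ in range(num_branches):
        scale, codes = fit_branch(residual)
        branches.append((scale, codes))
        # Each new branch is fitted to what the earlier branches missed.
        residual = [r - scale * c for r, c in zip(residual, codes)]
    return branches

def dequantize(branches):
    """Rebuild the (non-uniformly) quantized weights from the branches."""
    n = len(branches[0][1])
    return [sum(scale * codes[i] for scale, codes in branches) for i in range(n)]

def ternary_dot(x, codes):
    """Dot product with a ternary vector: additions and subtractions only."""
    total = 0.0
    for xi, c in zip(x, codes):
        if c == 1:
            total += xi
        elif c == -1:
            total -= xi
    return total

def branched_dot(x, branches):
    """One ternary dot product per branch, combined with one scale each."""
    return sum(scale * ternary_dot(x, codes) for scale, codes in branches)

weights = [0.9, -0.4, 0.1, -1.2, 0.05, 0.6]
activations = [1.0, 2.0, -0.5, 0.25, 3.0, -1.0]
branches = branch_quantize(weights, num_branches=2)
quantized = dequantize(branches)
direct = sum(a * w for a, w in zip(activations, quantized))
print(abs(direct - branched_dot(activations, branches)) < 1e-9)
```

With two branches, each quantized weight takes one of up to nine values determined by the two scales, so the levels are generally not evenly spaced, while the arithmetic per branch stays ternary.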

Taxonomy

Methods: Global Average Pooling · Inverted Residual Block · Pointwise Convolution · Dense Connections · Depthwise Convolution · Batch Normalization · Depthwise Separable Convolution · Softmax · MobileNetV1