DenseShift: Towards Accurate and Efficient Low-Bit Power-of-Two Quantization

Xinlin Li; Bang Liu; Rui Heng Yang; Vanessa Courville; Chao Xing; Vahid Partovi Nia

arXiv:2208.09708·cs.CV·October 25, 2023

TL;DR

DenseShift is a low-bit, multiplication-free (Shift) neural network that improves on the accuracy and efficiency of earlier Shift networks, reaching accuracy competitive with full-precision models on vision and speech tasks.

Contribution

The paper proposes DenseShift, a new low-bit Shift network with zero-free shifting, sign-scale decomposition, and improved training strategies, achieving higher accuracy and efficiency.
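To make the idea of a zero-free power-of-two weight concrete, here is a minimal sketch. It is an illustration under stated assumptions, not the paper's encoding: the function names, the exponent range, and the log-domain rounding rule are all hypothetical. It assumes one bit stores the sign and the remaining bits store a shift amount, so every code maps to a nonzero value of the form ±2^p.

```python
import math

def quantize_pot_zero_free(w, bits=3):
    """Map a float weight to (sign, exponent) with value sign * 2**exponent.

    Assumed layout (hypothetical, for illustration only): 1 sign bit plus
    (bits - 1) bits selecting an exponent in [-(n_shifts - 1), 0].
    Zero is not representable, so all 2**bits codes carry a nonzero weight.
    """
    n_shifts = 2 ** (bits - 1)          # number of distinct shift amounts
    p_min, p_max = -(n_shifts - 1), 0   # assumed exponent range
    sign = 1 if w >= 0 else -1
    mag = abs(w)
    # Round in the log domain; an exact zero falls back to the smallest magnitude.
    p = p_min if mag == 0 else round(math.log2(mag))
    p = min(max(p, p_min), p_max)       # clamp to the representable range
    return sign, p

def dequantize_pot(sign, p):
    """Recover the quantized weight value."""
    return sign * 2.0 ** p

# With bits=3 the representable magnitudes are 0.125, 0.25, 0.5 and 1.0.
for w in (0.3, -0.9, 0.0, 0.01):
    s, p = quantize_pot_zero_free(w)
    print(w, "->", dequantize_pot(s, p))
```

Under this assumed layout a 3-bit weight takes one of eight values, none of which is zero, which is the sense in which the encoding is "zero-free".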

Findings

01. DenseShift outperforms existing low-bit Shift networks.

02. Achieves a 1.6X speed-up with non-quantized floating-point activations.

03. Maintains accuracy competitive with full-precision models.

Abstract

Efficiently deploying deep neural networks on low-resource edge devices is challenging due to their ever-increasing resource requirements. To address this issue, researchers have proposed multiplication-free neural networks, such as Power-of-Two quantization networks, also known as Shift networks, which aim to reduce memory usage and simplify computation. However, existing low-bit Shift networks are not as accurate as their full-precision counterparts, typically suffering from limited weight range encoding schemes and quantization loss. In this paper, we propose the DenseShift network, which significantly improves the accuracy of Shift networks, achieving performance competitive with full-precision networks for vision and speech applications. In addition, we introduce a method to deploy an efficient DenseShift network using non-quantized floating-point activations, while obtaining 1.6X speed-up…
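The "multiplication-free" property mentioned in the abstract can be sketched as follows. This is a generic illustration, not the deployment method the paper introduces: the function name `shift_dot` and the (sign, shift) weight layout are assumptions. Scaling a float by 2^p only changes its exponent field, which Python exposes as `math.ldexp`, so a dot product against power-of-two weights needs only exponent adjustments, sign flips, and additions.

```python
import math

def shift_dot(activations, signs, shifts):
    """Dot product with weights of the form sign * 2**shift.

    math.ldexp(x, p) returns x * 2**p by adjusting the float exponent,
    so no general multiplication between activation and weight is needed.
    """
    total = 0.0
    for x, s, p in zip(activations, signs, shifts):
        scaled = math.ldexp(x, p)              # exponent adjustment only
        total += scaled if s > 0 else -scaled  # the sign is a negation, not a multiply
    return total

# Weights encoded as (sign, shift): +2**-1, -2**0, +2**-2
acts = [1.5, -2.0, 4.0]
print(shift_dot(acts, signs=[1, -1, 1], shifts=[-1, 0, -2]))
```

For integer activations the same role is played by a bit shift (`x << p` or `x >> p`); the floating-point form is shown here because the abstract refers to non-quantized activations.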

Code & Models

Videos

DenseShift: Towards Accurate and Efficient Low-Bit Power-of-Two Quantization · YouTube

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Machine Learning and ELM