Bit-balance: Model-Hardware Co-design for Accelerating NNs by Exploiting Bit-level Sparsity

Wenhao Sun, Zhiwei Zou, Deng Liu, Wendi Sun, Song Chen, and Yi Kang

arXiv:2302.00201 · cs.AR · February 2, 2023

TL;DR

This paper introduces Bit-balance, a co-designed model-hardware approach leveraging bit-level sparsity in neural networks to enhance resource and energy efficiency through balanced workloads and adaptive bitwidth computation.

Contribution

It proposes a novel bit-sparsity quantization method and a sparse bit-serial architecture for improved neural network acceleration.

Findings

1. Achieves 1.8x to 2.7x the energy efficiency of existing accelerators.

2. Supports multiple neural network architectures with high frame rates.

3. Maintains accuracy with minimal impact from bit sparsity constraints.

Abstract

Bit-serial architectures can handle Neural Networks (NNs) with different weight precisions, achieving higher resource efficiency than bit-parallel architectures. Moreover, the weights contain abundant zero bits owing to the fault tolerance of NNs, indicating that the bit sparsity of NNs can be further exploited for performance improvement. However, the irregular proportion of zero bits in each weight causes imbalanced workloads in the Processing Element (PE) array, which degrades performance or induces overhead for sparse processing. Thus, this paper proposes a bit-sparsity quantization method that keeps the bit sparsity ratio of each weight to no more than a certain value, balancing workloads with little accuracy loss. We then co-design a sparse bit-serial architecture, called Bit-balance, to improve overall performance, supporting weight-bit sparsity and adaptive bitwidth…
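
The workload-balancing idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: it assumes the constraint can be read as a cap on the number of non-zero bits per weight, it assumes a sign-magnitude integer representation, and it uses plain truncation (keeping the most significant set bits) where the actual method may round or retrain. The names `limit_nonzero_bits`, `nonzero_bit_positions`, and `sparse_bit_serial_mul` are hypothetical.

```python
from typing import List, Tuple


def limit_nonzero_bits(weight: int, max_nonzero: int) -> int:
    """Keep only the `max_nonzero` most significant set bits of |weight|.

    Assumption: simple truncation on a sign-magnitude integer, used here
    only to illustrate a per-weight cap on non-zero bits.
    """
    sign = -1 if weight < 0 else 1
    magnitude = abs(weight)
    kept = 0
    for _ in range(max_nonzero):
        if magnitude == 0:
            break
        top = 1 << (magnitude.bit_length() - 1)  # highest remaining set bit
        kept |= top
        magnitude ^= top  # clear that bit and continue with the rest
    return sign * kept


def nonzero_bit_positions(weight: int) -> List[int]:
    """Return the positions of the set bits in |weight|, lowest first."""
    magnitude = abs(weight)
    return [i for i in range(magnitude.bit_length()) if (magnitude >> i) & 1]


def sparse_bit_serial_mul(activation: int, weight: int) -> Tuple[int, int]:
    """Shift-and-add over non-zero weight bits only.

    Returns (product, cycles), where one cycle is spent per non-zero bit.
    This models the arithmetic only, not any hardware detail.
    """
    sign = -1 if weight < 0 else 1
    positions = nonzero_bit_positions(weight)
    acc = 0
    for p in positions:
        acc += activation << p
    return sign * acc, len(positions)


if __name__ == "__main__":
    max_nonzero = 3  # hypothetical cap on non-zero bits per weight
    for w in [91, 3, -127, 64]:
        q = limit_nonzero_bits(w, max_nonzero)
        product, cycles = sparse_bit_serial_mul(5, q)
        assert product == 5 * q and cycles <= max_nonzero
        print(w, q, cycles)
```

Under this sketch every weight costs at most `max_nonzero` shift-and-add steps, so processing elements working on different weights finish within a bounded number of cycles of each other. For example, the weight 91 (binary 1011011) becomes 88 (binary 1011000) when capped at three non-zero bits.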

Taxonomy

Topics: Advanced Neural Network Applications · Brain Tumor Detection and Classification · Machine Learning and ELM