Energy-efficient Dense DNN Acceleration with Signed Bit-slice Architecture
Dongseok Im, Gwangtae Park, Zhiyong Li, Junha Ryu, and Hoi-Jun Yoo

TL;DR
This paper introduces a signed bit-slice architecture that accelerates high-precision, dense DNNs on mobile SoCs by exploiting zero-valued bit-slices and balancing the positive and negative values of 2's complement data, improving area efficiency, energy efficiency, and throughput over previous accelerators.
Contribution
It proposes a novel signed bit-slice representation and architecture that accelerates dense DNNs by exploiting zero slices and output speculation, outperforming previous accelerators.
Findings
3.65x higher area-efficiency compared to Bit-fusion
3.88x higher energy-efficiency
5.35x higher throughput
Abstract
As the number of deep neural networks (DNNs) executed on a mobile system-on-chip (SoC) increases, the SoC struggles to accelerate DNNs in real time within its limited hardware resources and power budget. Although previous mobile neural processing units (NPUs) take advantage of low-bit computing and sparsity exploitation, they are incapable of accelerating high-precision and dense DNNs. This paper proposes an energy-efficient signed bit-slice architecture that accelerates both high-precision and dense DNNs by exploiting the large number of zero values in signed bit-slices. The proposed signed bit-slice representation (SBR) changes a signed bit-slice to a zero bit-slice by borrowing a value from its lower-order bit-slice. As a result, it generates a large number of zero bit-slices even in dense DNNs. Moreover, it balances the positive and negative values of 2's…
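The borrowing step described in the abstract can be illustrated with a small sketch. This is one possible reading, not the paper's exact encoding: it assumes 4-bit slices, treats a high-order slice of all ones (value -1) as the slice to be zeroed, and keeps slices as plain Python integers, so the hardware bit width of the adjusted lower slice is out of scope. The function names `conventional_slices`, `borrow_slices`, and `reconstruct` are hypothetical.

```python
def conventional_slices(value, slice_bits=4, n_slices=2):
    """Split a 2's complement integer into slices, most significant first.

    The top slice is signed; all lower slices are unsigned.
    """
    total_bits = slice_bits * n_slices
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    if not lo <= value <= hi:
        raise ValueError("value does not fit in the given number of slices")
    mask = (1 << slice_bits) - 1
    # Extract slices least significant first, then reverse.
    slices = [(value >> (slice_bits * i)) & mask for i in range(n_slices)]
    if slices[-1] >= 1 << (slice_bits - 1):
        slices[-1] -= 1 << slice_bits  # interpret the top slice as signed
    return slices[::-1]


def borrow_slices(slices, slice_bits=4):
    """Assumed borrowing rule: a slice equal to -1 becomes 0, and the next
    lower slice absorbs the difference by subtracting the radix."""
    radix = 1 << slice_bits
    out = list(slices)
    for i in range(len(out) - 1):
        if out[i] == -1:
            out[i] = 0
            out[i + 1] -= radix
    return out


def reconstruct(slices, slice_bits=4):
    """Recombine slices (most significant first) into an integer."""
    value = 0
    for s in slices:
        value = value * (1 << slice_bits) + s
    return value


# Small negative values are dense in 2's complement but sparse after borrowing.
print(conventional_slices(-3))                            # [-1, 13]
print(borrow_slices(conventional_slices(-3)))             # [0, -3]
print(borrow_slices(conventional_slices(-1, n_slices=3))) # [0, 0, -1]
```

The value is preserved because -1 * 16 + low equals 0 * 16 + (low - 16). Under this assumption, small negative numbers, whose high-order slices are all ones, end up with zero high-order slices that a zero-skipping datapath could bypass.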
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Memory and Neural Computing · Parallel Computing and Optimization Techniques
