A Low-Power Accelerator for Deep Neural Networks with Enlarged Near-Zero Sparsity
Yuxiang Huan, Yifan Qin, Yantian You, Lirong Zheng, and Zhuo Zou

TL;DR
This paper introduces a low-power DNN accelerator that leverages near-zero sparsity to skip multiplications, significantly reducing energy consumption and increasing speed on embedded devices.
Contribution
It proposes a Near-Zero Approximation Unit (NZAU) and a grouping architecture to efficiently skip near-zero multiplications, achieving substantial power savings.
Findings
Achieves 1.92X and 1.51X further reduction in total multiplications for LeNet-5 and AlexNet, respectively, compared with skipping only zero-valued computations.
Operates over 4X faster than Tegra K1 in processing fully-connected layers.
Consumes 717X less energy than the mobile GPU.
Abstract
It remains a challenge to run Deep Learning on devices with stringent power budgets in the Internet-of-Things. This paper presents a low-power accelerator for processing Deep Neural Networks in embedded devices. The power reduction is realized by avoiding multiplications of near-zero valued data. A near-zero approximation and a dedicated Near-Zero Approximation Unit (NZAU) are proposed to predict and skip near-zero multiplications under certain thresholds. Compared with skipping zero-valued computations, our design achieves 1.92X and 1.51X further reduction of the total multiplications in LeNet-5 and AlexNet respectively, with negligible loss of accuracy. In the proposed accelerator, 256 multipliers are grouped into 16 independent Processing Lanes (PL) to support up to 16 neuron activations simultaneously. With the help of data pre-processing and buffering in each PL,…
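The idea of skipping near-zero multiplications can be illustrated with a minimal software sketch. This is not the NZAU hardware logic. The function name, the rule of skipping whenever either operand's magnitude falls at or below the threshold, the threshold value, and the sample data are all assumptions chosen for illustration. Setting the threshold to 0 reproduces plain zero-skipping, which is the baseline the abstract compares against.

```python
def dot_with_near_zero_skip(inputs, weights, threshold):
    """Accumulate a dot product, skipping near-zero multiplications.

    Returns (accumulated sum, number of multiplications performed).
    Assumed rule (hypothetical, not taken from the paper): a product is
    skipped when either operand has magnitude <= threshold.
    """
    total = 0.0
    mults = 0
    for x, w in zip(inputs, weights):
        if abs(x) <= threshold or abs(w) <= threshold:
            continue  # treat the product as zero and skip the multiplier
        total += x * w
        mults += 1
    return total, mults


# Illustrative data for a single neuron (made up for this sketch).
inputs = [0.0, 0.02, 0.5, -0.01, 1.0, 0.0, -0.75, 0.03]
weights = [0.3, -0.4, 0.25, 0.9, -0.5, 0.8, 0.2, 0.6]

# threshold = 0.0 skips only exact zeros (zero-skipping baseline).
exact, zero_skip_mults = dot_with_near_zero_skip(inputs, weights, threshold=0.0)
# A small positive threshold additionally skips near-zero operands.
approx, near_zero_mults = dot_with_near_zero_skip(inputs, weights, threshold=0.05)

print("zero-skip multiplications:", zero_skip_mults)
print("near-zero-skip multiplications:", near_zero_mults)
print("absolute error introduced:", abs(exact - approx))
```

In this toy case the near-zero threshold halves the multiplication count relative to zero-skipping while changing the accumulated sum only slightly. The real trade-off depends on the network, the data, and the thresholds chosen.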
Taxonomy
Topics: Radiation Detection and Scintillator Technologies · CCD and CMOS Imaging Sensors · Quantum-Dot Cellular Automata
