Minimizing Area and Energy of Deep Learning Hardware Design Using Collective Low Precision and Structured Compression

Shihui Yin; Gaurav Srivastava; Shreyas K. Venkataramanaiah; Chaitali Chakrabarti; Visar Berisha; Jae-sun Seo

arXiv:1804.07370·cs.NE·April 23, 2018·1 cites

TL;DR

This paper introduces combined low-precision and structured sparsity techniques during training to minimize the area and energy consumption of deep neural network hardware with minimal accuracy loss.

Contribution

It presents a novel approach integrating structured sparsity and low-precision weights during training for optimized DNN hardware design.

Findings

1. 50X weight memory reduction on CIFAR-10 while maintaining accuracy

2. 98.4% accuracy at 20 nJ per classification on MNIST

3. Effective combination of sparsity and low precision for hardware efficiency

Abstract

Deep learning algorithms have shown tremendous success in many recognition tasks; however, these algorithms typically include a deep neural network (DNN) structure and a large number of parameters, which makes it challenging to implement them on power/area-constrained embedded platforms. To reduce the network size, several studies investigated compression by introducing element-wise or row-/column-/block-wise sparsity via pruning and regularization. In addition, many recent works have focused on reducing the precision of activations and weights, with some reducing it down to a single bit. However, combining various sparsity structures with binarized or very-low-precision (2-3 bit) neural networks has not been comprehensively explored. In this work, we present design techniques for minimum-area/-energy DNN hardware with minimal degradation in accuracy. During training, both…
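To make the idea of combining block-wise sparsity with binarized weights concrete, the following is a minimal sketch and not the authors' training procedure. It assumes a post-hoc, magnitude-based choice of which blocks to keep (the paper applies its techniques during training), one mask bit per block as the only indexing overhead, and a 32-bit dense baseline. The function names `prune_and_binarize` and `compression_ratio` are hypothetical and introduced only for illustration.

```python
def prune_and_binarize(weights, block, keep_fraction):
    """Keep the highest-energy block x block tiles and binarize them to +1/-1.

    Assumption: tiles are ranked by their sum of squared weights. This is an
    illustrative stand-in for a sparsity structure learned during training.
    """
    rows, cols = len(weights), len(weights[0])
    if rows % block or cols % block:
        raise ValueError("matrix dimensions must be multiples of the block size")

    # Top-left corner of every tile in the weight matrix.
    tiles = [(r, c) for r in range(0, rows, block) for c in range(0, cols, block)]

    def energy(tile):
        r0, c0 = tile
        return sum(weights[r][c] ** 2
                   for r in range(r0, r0 + block)
                   for c in range(c0, c0 + block))

    n_keep = max(1, round(keep_fraction * len(tiles)))
    kept = set(sorted(tiles, key=energy, reverse=True)[:n_keep])

    # Pruned tiles stay at 0; kept tiles store only the sign of each weight.
    out = [[0] * cols for _ in range(rows)]
    for r0, c0 in kept:
        for r in range(r0, r0 + block):
            for c in range(c0, c0 + block):
                out[r][c] = 1 if weights[r][c] >= 0 else -1
    return out, kept


def compression_ratio(rows, cols, block, n_kept, bits_per_weight=1, dense_bits=32):
    """Dense storage divided by compressed storage, both in bits.

    Assumption: the compressed form stores the kept tiles at bits_per_weight
    each, plus one mask bit per tile to record which tiles survive.
    """
    n_tiles = (rows // block) * (cols // block)
    dense = rows * cols * dense_bits
    compressed = n_kept * block * block * bits_per_weight + n_tiles
    return dense / compressed


if __name__ == "__main__":
    w = [[0.9, -0.8, 0.01, 0.02],
         [0.7, 0.6, -0.01, 0.0],
         [0.0, 0.01, -0.5, 0.4],
         [0.02, -0.01, 0.3, -0.6]]
    q, kept = prune_and_binarize(w, block=2, keep_fraction=0.5)
    for row in q:
        print(row)
    print(compression_ratio(4, 4, 2, len(kept)))
```

The ratio computed here depends entirely on the assumed block size, keep fraction, and baseline precision, so it should not be read as a reproduction of the 50X figure reported for CIFAR-10.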


Taxonomy

Topics: Advanced Neural Network Applications · Ferroelectric and Negative Capacitance Devices · Advanced Memory and Neural Computing

Methods: Pruning