Griffin: Rethinking Sparse Optimization for Deep Learning Architectures

Jong Hoon Shin, Ali Shafiee, Ardavan Pedram, Hamzah Abdel-Aziz, Ling Li, and Joseph Hassoun

arXiv:2107.12922·cs.AR·November 3, 2021

Open Access

TL;DR

Griffin is a hybrid DNN accelerator architecture that handles both sparse and dense computations, reaching up to 3.1x better power efficiency than prior sparse architectures by limiting the overheads of sparsity support.

Contribution

The paper introduces Griffin, a hybrid architecture that rethinks sparse optimization for deep learning, reducing sparsity overheads and improving power efficiency across all four combinations of dense or sparse activation and weight tensors.

Findings

01. Griffin is up to 3.1x more power-efficient than state-of-the-art sparse architectures.

02. Supporting dual sparsity costs 20-30% in power efficiency on single-sparse models (those with only sparse weights or only sparse activations).

03. Reusing resources within the same core lets Griffin maintain high performance on both sparse and dense models.

Abstract

This paper examines the design space trade-offs of DNN accelerators aiming to achieve competitive performance and efficiency metrics for all four combinations of dense or sparse activation/weight tensors. To do so, we systematically examine the overheads of supporting sparsity on top of an optimized dense core. These overheads are modeled based on parameters that indicate how a multiplier can borrow a nonzero operation from the neighboring multipliers or future cycles. As a result of this exploration, we identify a few promising designs that perform better than prior work. Our findings suggest that even the best design targeting dual sparsity yields a 20%-30% drop in power efficiency when running single-sparse models, i.e., those with only sparse weight or sparse activation tensors. We found that one can reuse resources of the same core to maintain high performance and efficiency…
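The borrowing idea in the abstract can be illustrated with a toy cycle-count model. This is only a sketch under simplifying assumptions, not the paper's overhead model: the function name `cycles_with_borrowing` and the parameters `lookahead` (how many future steps a multiplier may pull work from) and `lane_reach` (how many lanes away it may borrow from) are hypothetical names chosen for illustration, and the greedy assignment ignores routing, buffering, and accumulation costs.

```python
def cycles_with_borrowing(nonzero, lookahead=0, lane_reach=0):
    """Count cycles needed to drain a grid of multiply operations.

    nonzero[t][l] is True when lane l has a nonzero (effectual) product
    scheduled at dense time step t. Both parameters are illustrative
    assumptions, not taken from the paper.
    """
    steps = len(nonzero)
    if steps == 0:
        return 0
    lanes = len(nonzero[0])
    pending = [list(row) for row in nonzero]  # copy so the input is untouched
    # Nearest lanes first: 0, -1, +1, -2, +2, ...
    offsets = sorted(range(-lane_reach, lane_reach + 1), key=abs)

    head = 0    # earliest time step that may still hold pending work
    cycles = 0
    while head < steps:
        cycles += 1
        window_end = min(steps, head + lookahead + 1)
        for lane in range(lanes):
            # Each multiplier executes at most one operation per cycle,
            # preferring the earliest step and then the nearest lane.
            taken = False
            for t in range(head, window_end):
                for offset in offsets:
                    src = lane + offset
                    if 0 <= src < lanes and pending[t][src]:
                        pending[t][src] = False
                        taken = True
                        break
                if taken:
                    break
        # Advance past fully drained steps, but never beyond the window.
        while head < window_end and not any(pending[head]):
            head += 1
    return cycles


# Two lanes, four dense steps; nonzero work alternates between lanes.
alternating = [[True, False], [False, True], [True, False], [False, True]]
# Two lanes, four dense steps; all nonzero work sits in lane 0.
lopsided = [[True, False] for _ in range(4)]

dense_cycles = cycles_with_borrowing(alternating)                  # no borrowing
time_only = cycles_with_borrowing(alternating, lookahead=1)        # borrow from future cycles
lane_and_time = cycles_with_borrowing(lopsided, lookahead=1, lane_reach=1)
```

In this toy model, borrowing from future cycles alone is enough when the nonzero work is spread evenly across lanes, while an imbalanced pattern also needs borrowing from neighboring multipliers before any cycles are saved. Each extra degree of freedom would correspond to additional selection and routing hardware in a real core, which is the kind of overhead the abstract says the paper models.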

Taxonomy

Topics: Advanced Neural Network Applications · Parallel Computing and Optimization Techniques · Ferroelectric and Negative Capacitance Devices