Two Sparsities Are Better Than One: Unlocking the Performance Benefits of Sparse-Sparse Networks

Kevin Lee Hunter; Lawrence Spracklen; Subutai Ahmad

arXiv:2112.13896·cs.LG·December 30, 2021

TL;DR

This paper introduces Complementary Sparsity, a technique that leverages both weight and activation sparsity in neural networks to achieve up to 100X improvements in throughput and energy efficiency on FPGAs.

Contribution

The paper proposes a novel method called Complementary Sparsity that effectively combines weight and activation sparsity to enhance neural network performance on existing hardware.

Findings

1. Achieved up to 100X throughput and energy efficiency improvements.
2. Demonstrated scalability and resource tradeoffs for convolutional networks.
3. Showed potential for efficient scaling of future AI models.

Abstract

In principle, sparse neural networks should be significantly more efficient than traditional dense networks. Neurons in the brain exhibit two types of sparsity; they are sparsely interconnected and sparsely active. These two types of sparsity, called weight sparsity and activation sparsity, when combined, offer the potential to reduce the computational cost of neural networks by two orders of magnitude. Despite this potential, today's neural networks deliver only modest performance benefits using just weight sparsity, because traditional computing hardware cannot efficiently process sparse networks. In this article we introduce Complementary Sparsity, a novel technique that significantly improves the performance of dual sparse networks on existing hardware. We demonstrate that we can achieve high performance running weight-sparse networks, and we can multiply those speedups by…
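
The "two orders of magnitude" figure in the abstract can be illustrated with a small counting exercise. The sketch below is not the paper's Complementary Sparsity technique; it only counts how many multiplications in a matrix-vector product have both a nonzero weight and a nonzero activation. The layer size (1000 inputs, 100 outputs), the 10% density for both weights and activations, and the helper names `make_sparse_vector` and `count_nonzero_products` are assumptions chosen for illustration.

```python
import random

def make_sparse_vector(length, density, rng):
    """Return a list with exactly round(length * density) nonzero entries."""
    nonzero = round(length * density)
    positions = set(rng.sample(range(length), nonzero))
    # uniform(0.1, 1.0) never yields 0.0, so the nonzero count is exact.
    return [rng.uniform(0.1, 1.0) if i in positions else 0.0 for i in range(length)]

def count_nonzero_products(weight_rows, activations):
    """Count multiplications where both the weight and the activation are nonzero."""
    active = [i for i, a in enumerate(activations) if a != 0.0]
    return sum(1 for row in weight_rows for i in active if row[i] != 0.0)

rng = random.Random(0)            # fixed seed so the run is repeatable
n_in, n_out = 1000, 100           # hypothetical layer size
weight_density = 0.1              # assumed: 10% of weights are nonzero
activation_density = 0.1          # assumed: 10% of activations are nonzero

# A dense layer performs one multiplication per (output, input) pair.
dense_macs = n_in * n_out

weights = [make_sparse_vector(n_in, weight_density, rng) for _ in range(n_out)]
activations = make_sparse_vector(n_in, activation_density, rng)

# Only products with two nonzero operands contribute to the output.
sparse_macs = count_nonzero_products(weights, activations)
reduction = dense_macs / sparse_macs

print(f"dense multiplications:  {dense_macs}")
print(f"useful multiplications: {sparse_macs}")
print(f"reduction factor:       {reduction:.1f}x")
```

Under these assumptions the two densities multiply, so only about 1% of the dense multiplications are useful, which gives a reduction factor of roughly 100. This is an upper bound on the arithmetic saved; the abstract's point is that conventional hardware struggles to turn that saving into actual speedup.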

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Memory and Neural Computing · Ferroelectric and Negative Capacitance Devices

Methods: Pointwise Convolution · Depthwise Convolution · Depthwise Separable Convolution · Batch Normalization · Inverted Residual Block · Average Pooling · Convolution · 1x1 Convolution