Two Sparsities Are Better Than One: Unlocking the Performance Benefits of Sparse-Sparse Networks
Kevin Lee Hunter, Lawrence Spracklen, Subutai Ahmad

TL;DR
This paper introduces Complementary Sparsity, a technique that leverages both weight and activation sparsity in neural networks to achieve up to 100X improvements in throughput and energy efficiency on FPGAs.
Contribution
The paper proposes a novel method called Complementary Sparsity that effectively combines weight and activation sparsity to enhance neural network performance on existing hardware.
Findings
Achieved up to 100X improvements in throughput and energy efficiency on FPGAs.
Demonstrated scalability and resource tradeoffs for convolutional networks.
Showed potential for efficient scaling of future AI models.
Abstract
In principle, sparse neural networks should be significantly more efficient than traditional dense networks. Neurons in the brain exhibit two types of sparsity; they are sparsely interconnected and sparsely active. These two types of sparsity, called weight sparsity and activation sparsity, when combined, offer the potential to reduce the computational cost of neural networks by two orders of magnitude. Despite this potential, today's neural networks deliver only modest performance benefits using just weight sparsity, because traditional computing hardware cannot efficiently process sparse networks. In this article we introduce Complementary Sparsity, a novel technique that significantly improves the performance of dual sparse networks on existing hardware. We demonstrate that we can achieve high performance running weight-sparse networks, and we can multiply those speedups by…
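The "two orders of magnitude" figure follows from the two sparsities multiplying: a multiply-accumulate only does useful work when both the weight and the activation are nonzero. The sketch below illustrates that arithmetic only; it is not the paper's Complementary Sparsity technique. The 10% densities, the vector length, and the helper names (`sparse_vector`, `count_macs`, `expected_mac_fraction`) are assumptions chosen for illustration, not values or APIs from the paper.

```python
import random


def sparse_vector(n, density, rng):
    """Return a length-n list with round(n * density) nonzero entries at random positions."""
    k = round(n * density)
    v = [0.0] * n
    for i in rng.sample(range(n), k):
        v[i] = rng.uniform(0.5, 1.5)  # strictly positive, so never accidentally zero
    return v


def count_macs(weights, activations):
    """Count positions where both the weight and the activation are nonzero."""
    return sum(1 for w, a in zip(weights, activations) if w != 0.0 and a != 0.0)


def expected_mac_fraction(weight_density, activation_density):
    """Expected fraction of useful multiply-accumulates, assuming independent nonzero positions."""
    return weight_density * activation_density


rng = random.Random(0)
n = 100_000
weight_density = 0.1      # assumed: 10% of weights are nonzero
activation_density = 0.1  # assumed: 10% of activations are nonzero

w = sparse_vector(n, weight_density, rng)
a = sparse_vector(n, activation_density, rng)

dense_macs = n                   # a dense dot product touches every position
useful_macs = count_macs(w, a)   # only these contribute to the result

print(f"dense MACs:  {dense_macs}")
print(f"useful MACs: {useful_macs}")
print(f"measured fraction: {useful_macs / dense_macs:.4f}")
print(f"expected fraction: {expected_mac_fraction(weight_density, activation_density):.4f}")
```

With either sparsity alone, roughly 10% of the multiply-accumulates would remain under these assumptions; with both, roughly 1% do, which is where a potential 100X reduction comes from. Realizing that reduction on actual hardware is the problem the abstract describes.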
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Memory and Neural Computing · Ferroelectric and Negative Capacitance Devices
Methods: Pointwise Convolution · Depthwise Convolution · Depthwise Separable Convolution · Batch Normalization · Inverted Residual Block · Average Pooling · Convolution · 1x1 Convolution
