The Sparsity Roofline: Understanding the Hardware Limits of Sparse Neural Networks
Cameron Shinn, Collin McCarthy, Saurav Muralidharan, Muhammad Osama, John D. Owens

TL;DR
The paper introduces the Sparsity Roofline, a visual performance model that predicts the hardware performance limits of sparse neural networks by jointly modeling accuracy, sparsity, and speedup without needing optimized kernels.
Contribution
It presents a novel analytical model for predicting sparse network performance and validates it across various architectures and sparsity patterns, aiding researchers and hardware designers.
Findings
The predicted speedup is validated on several real-world computer vision architectures pruned across a range of sparsity patterns and degrees.
It helps identify sparsity regimes with the highest performance potential.
The approach does not require implementing optimized kernels.
Abstract
We introduce the Sparsity Roofline, a visual performance model for evaluating sparsity in neural networks. The Sparsity Roofline jointly models network accuracy, sparsity, and theoretical inference speedup. Our approach does not require implementing and benchmarking optimized kernels, and the theoretical speedup becomes equal to the actual speedup when the corresponding dense and sparse kernels are well-optimized. We achieve this through a novel analytical model for predicting sparse network performance, and validate the predicted speedup using several real-world computer vision architectures pruned across a range of sparsity patterns and degrees. We demonstrate the utility and ease-of-use of our model through two case studies: (1) we show how machine learning researchers can predict the performance of unimplemented or unoptimized block-structured sparsity patterns, and (2) we show how…
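The abstract does not spell out the analytical model, so the following is only a minimal sketch of a classic roofline-style estimate, not the authors' formulation. It assumes a single layer computed as an (m x k) weight matrix times a (k x n) activation matrix, and bounds each kernel's runtime by the larger of its compute time and its memory-traffic time. The names `Device`, `roofline_time`, `theoretical_speedup`, and `index_bytes_per_nnz` are hypothetical and introduced here purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Device:
    """Hypothetical device description; values are placeholders, not measurements."""
    peak_flops: float     # peak arithmetic throughput in FLOP/s
    mem_bandwidth: float  # peak memory bandwidth in bytes/s


def roofline_time(flops: float, bytes_moved: float, dev: Device) -> float:
    """Roofline lower bound on runtime: the slower of compute and memory traffic."""
    return max(flops / dev.peak_flops, bytes_moved / dev.mem_bandwidth)


def theoretical_speedup(m: int, k: int, n: int, density: float, dev: Device,
                        bytes_per_elem: float = 2.0,
                        index_bytes_per_nnz: float = 0.0) -> float:
    """Estimated dense-over-sparse speedup for one layer.

    Assumptions (not taken from the paper): only the weight matrix is sparse,
    each operand is read or written exactly once, and sparse metadata costs
    `index_bytes_per_nnz` bytes per stored nonzero.
    """
    # Dense kernel: 2*m*k*n FLOPs, reads weights and activations, writes outputs.
    dense_flops = 2.0 * m * k * n
    dense_bytes = bytes_per_elem * (m * k + k * n + m * n)

    # Sparse kernel: work and weight traffic scale with the nonzero count.
    nnz = density * m * k
    sparse_flops = 2.0 * nnz * n
    sparse_bytes = bytes_per_elem * (nnz + k * n + m * n) + index_bytes_per_nnz * nnz

    dense_time = roofline_time(dense_flops, dense_bytes, dev)
    sparse_time = roofline_time(sparse_flops, sparse_bytes, dev)
    return dense_time / sparse_time
```

Under these assumptions, a compute-bound layer approaches a speedup of 1/density, while a memory-bound layer gains less because the activation and output traffic does not shrink with the weights. Pairing such an estimate with the measured accuracy at each sparsity level gives the kind of joint accuracy, sparsity, and speedup view the abstract describes.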
Taxonomy
Topics: CCD and CMOS Imaging Sensors · Advanced Neural Network Applications · Advanced Memory and Neural Computing
