Exploring the Regularity of Sparse Structure in Convolutional Neural Networks
Huizi Mao, Song Han, Jeff Pool, Wenshuo Li, Xingyu Liu, Yu Wang, William J. Dally

TL;DR
This paper investigates how the structure of sparsity in CNNs affects hardware efficiency and accuracy, showing that coarse-grained pruning maintains accuracy and improves compression and memory efficiency over fine-grained sparsity.
Contribution
It provides a quantitative analysis of the trade-offs between sparsity regularity and prediction accuracy, demonstrating the benefits of coarse-grained pruning for hardware acceleration.
Findings
Coarse-grained pruning achieves sparsity ratios similar to those of unstructured pruning without accuracy loss.
Coarse-grained sparsity results in better compression ratios due to index savings.
Coarse-grained sparsity reduces memory references by about 2x compared to fine-grained sparsity.
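The index-saving argument can be illustrated with a back-of-the-envelope estimate. The sketch below is not the paper's storage format: the function name `storage_bits`, the 8-bit weight width, and the 4-bit index width are assumptions chosen only for illustration. It compares keeping one index per surviving weight (fine-grained) with keeping one index per surviving block of weights (coarse-grained).

```python
def storage_bits(kept_weights, block_size, weight_bits=8, index_bits=4):
    """Rough storage estimate for a pruned layer (hypothetical format).

    Assumes one index is stored per surviving block of `block_size`
    weights; block_size=1 corresponds to fine-grained sparsity.
    The bit widths are illustrative defaults, not values from the paper.
    """
    kept_blocks = -(-kept_weights // block_size)  # ceiling division
    return kept_weights * weight_bits + kept_blocks * index_bits


# Same number of surviving weights, different granularity.
fine_bits = storage_bits(kept_weights=90_000, block_size=1)
coarse_bits = storage_bits(kept_weights=90_000, block_size=9)  # e.g. a 3x3 kernel

print(fine_bits, coarse_bits)
```

Under these assumed widths the weight payload is identical in both cases, and the difference comes entirely from the number of indices stored.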
Abstract
Sparsity helps reduce the computational complexity of deep neural networks by skipping zeros. Taking advantage of sparsity is listed as a high priority in next generation DNN accelerators such as TPU. The structure of sparsity, i.e., the granularity of pruning, affects the efficiency of hardware accelerator design as well as the prediction accuracy. Coarse-grained pruning creates regular sparsity patterns, making it more amenable to hardware acceleration but more challenging to maintain the same accuracy. In this paper we quantitatively measure the trade-off between sparsity regularity and prediction accuracy, providing insights into how to maintain accuracy while having a more structured sparsity pattern. Our experimental results show that coarse-grained pruning can achieve a sparsity ratio similar to unstructured pruning without loss of accuracy. Moreover, due to the index saving…
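To make the notion of pruning granularity concrete, here is a minimal sketch of magnitude pruning at two granularities on a small weight matrix. It is an illustration under stated assumptions rather than the authors' procedure: the helper names `prune_fine` and `prune_rows` are hypothetical, whole rows stand in for a coarse-grained group, and the sum of absolute values is assumed as the group's importance score.

```python
def prune_fine(weights, num_pruned):
    """Fine-grained pruning: zero the individual weights with the smallest magnitude."""
    cells = sorted(
        (abs(w), r, c)
        for r, row in enumerate(weights)
        for c, w in enumerate(row)
    )
    pruned = [list(row) for row in weights]  # copy so the input is left intact
    for _, r, c in cells[:num_pruned]:
        pruned[r][c] = 0.0
    return pruned


def prune_rows(weights, num_pruned_rows):
    """Coarse-grained pruning: zero whole rows with the smallest sum of |w|.

    Using rows as the group and sum-of-absolute-values as the score
    is an assumption made for this sketch.
    """
    order = sorted(
        range(len(weights)),
        key=lambda r: sum(abs(w) for w in weights[r]),
    )
    pruned = [list(row) for row in weights]
    for r in order[:num_pruned_rows]:
        pruned[r] = [0.0] * len(pruned[r])
    return pruned


weights = [
    [0.9, -0.1, 0.8],
    [0.25, 0.3, -0.2],
    [-0.7, 0.05, 0.6],
]

# Both calls remove 3 of the 9 weights, but in different patterns.
fine = prune_fine(weights, num_pruned=3)
coarse = prune_rows(weights, num_pruned_rows=1)

for row in fine:
    print(row)
for row in coarse:
    print(row)
```

At equal sparsity, the fine-grained result scatters zeros across the matrix, while the coarse-grained result removes one contiguous row. The regular pattern is what makes coarse-grained sparsity easier to exploit in hardware, at the cost of also discarding some larger weights (0.25 and 0.3 here).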
Taxonomy
Topics: Advanced Neural Network Applications · Machine Learning and ELM · Domain Adaptation and Few-Shot Learning
Methods: Pruning
