Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks
Torsten Hoefler, Dan Alistarh, Tal Ben-Nun, Nikoli Dryden, Alexandra Peste

TL;DR
This survey comprehensively reviews sparsity techniques in deep learning, covering pruning and growth methods for efficient inference and training, and provides practical guidance and open problems for future research.
Contribution
It offers an extensive tutorial on sparsification methods, distills insights from over 300 papers, and introduces a metric for pruned parameter efficiency.
Findings
Sparse networks can match or outperform dense networks in generalization.
Sparsity reduces memory footprint and training time for large models.
Practical acceleration techniques for sparse models on hardware are discussed.
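To make the memory-footprint finding concrete, here is a minimal sketch of one simple pruning scheme, global magnitude pruning, followed by a rough storage comparison. It is an illustration under stated assumptions, not the survey's method: the function names (`magnitude_prune`, `dense_bytes`, `csr_bytes`), the 4-byte value and index sizes, and the CSR-style storage estimate are all hypothetical choices made for this example.

```python
import random


def magnitude_prune(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude entries zeroed.

    `sparsity` is the fraction of entries to remove (0.0 to 1.0).
    Ties at the threshold are all pruned, so slightly more than the
    requested fraction can be removed when magnitudes repeat.
    """
    magnitudes = sorted(abs(w) for row in weights for w in row)
    k = int(len(magnitudes) * sparsity)  # number of weights to remove
    if k == 0:
        return [row[:] for row in weights]
    threshold = magnitudes[k - 1]  # largest magnitude that gets pruned
    return [[0.0 if abs(w) <= threshold else w for w in row] for row in weights]


def dense_bytes(weights, value_bytes=4):
    """Storage for a dense matrix: one value per entry (assumed 4-byte floats)."""
    return sum(len(row) for row in weights) * value_bytes


def csr_bytes(weights, value_bytes=4, index_bytes=4):
    """Estimated storage for a CSR-style layout (an assumption of this sketch):
    one value and one column index per nonzero, plus one row pointer per row + 1.
    """
    nnz = sum(1 for row in weights for w in row if w != 0.0)
    return nnz * (value_bytes + index_bytes) + (len(weights) + 1) * index_bytes


# Usage: prune 90% of a random 256x256 layer and compare storage estimates.
random.seed(0)
W = [[random.gauss(0.0, 1.0) for _ in range(256)] for _ in range(256)]
W_sparse = magnitude_prune(W, sparsity=0.9)
print("dense bytes:", dense_bytes(W))
print("sparse (CSR-style) bytes:", csr_bytes(W_sparse))
```

Because each stored nonzero carries an index as well as a value, an index-based format like this only saves memory once sparsity is high enough; with 4-byte values and 4-byte indices, the break-even point is roughly 50% sparsity.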
Abstract
The growing energy and performance costs of deep learning have driven the community to reduce the size of neural networks by selectively pruning components. Similarly to their biological counterparts, sparse networks generalize just as well, if not better than, the original dense networks. Sparsity can reduce the memory footprint of regular networks to fit mobile devices, as well as shorten training time for ever growing networks. In this paper, we survey prior work on sparsity in deep learning and provide an extensive tutorial of sparsification for both inference and training. We describe approaches to remove and add elements of neural networks, different training strategies to achieve model sparsity, and mechanisms to exploit sparsity in practice. Our work distills ideas from more than 300 research papers and provides guidance to practitioners who wish to utilize sparsity today, as…
Taxonomy
Methods: Pruning
