Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks

Torsten Hoefler; Dan Alistarh; Tal Ben-Nun; Nikoli Dryden; Alexandra Peste

arXiv:2102.00554·cs.LG·February 2, 2021

TL;DR

This survey comprehensively reviews sparsity techniques in deep learning, covering pruning and growth methods for efficient inference and training, and provides practical guidance and open problems for future research.

Contribution

It offers an extensive tutorial on sparsification methods, distills insights from over 300 papers, and introduces a metric for pruned parameter efficiency.

Findings

01 Sparse networks can match or outperform dense networks in generalization.

02 Sparsity can reduce the memory footprint of networks enough to fit mobile devices, and can shorten training time for ever-growing models.

03 Practical acceleration techniques for sparse models on hardware are discussed.

Abstract

The growing energy and performance costs of deep learning have driven the community to reduce the size of neural networks by selectively pruning components. Similarly to their biological counterparts, sparse networks generalize just as well, if not better than, the original dense networks. Sparsity can reduce the memory footprint of regular networks to fit mobile devices, as well as shorten training time for ever growing networks. In this paper, we survey prior work on sparsity in deep learning and provide an extensive tutorial of sparsification for both inference and training. We describe approaches to remove and add elements of neural networks, different training strategies to achieve model sparsity, and mechanisms to exploit sparsity in practice. Our work distills ideas from more than 300 research papers and provides guidance to practitioners who wish to utilize sparsity today, as…
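
To make "remove and add elements" concrete, here is a minimal sketch of one prune-and-regrow step on a flat list of weights. It assumes a magnitude criterion for removal and uniformly random regrowth, which are only two of the many choices a survey of this kind covers; the function names `magnitude_prune` and `random_regrow` and the example weights are hypothetical and are not taken from the paper.

```python
import random

def magnitude_prune(weights, sparsity):
    """Return a 0/1 mask that drops the `sparsity` fraction of smallest-|w| entries."""
    n_prune = int(len(weights) * sparsity)
    # Indices sorted by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    mask = [1] * len(weights)
    for i in order[:n_prune]:
        mask[i] = 0
    return mask

def random_regrow(mask, n_grow, rng):
    """Re-enable up to `n_grow` currently pruned positions, chosen uniformly at random."""
    pruned = [i for i, m in enumerate(mask) if m == 0]
    new_mask = list(mask)
    for i in rng.sample(pruned, min(n_grow, len(pruned))):
        new_mask[i] = 1
    return new_mask

# Illustrative weights (made up for this sketch).
weights = [0.9, -0.05, 0.3, 0.01, -0.7, 0.2, -0.02, 0.6]

mask = magnitude_prune(weights, sparsity=0.5)           # remove elements
sparse_weights = [w * m for w, m in zip(weights, mask)]  # apply the mask
regrown = random_regrow(mask, n_grow=1, rng=random.Random(0))  # add one back
```

In a training loop, steps like these would typically alternate with ordinary gradient updates on the surviving weights; the schedule and the criteria for removal and growth are design choices, not something this sketch fixes.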

Taxonomy

Methods: Pruning