The Unreasonable Effectiveness of Random Pruning: Return of the Most Naive Baseline for Sparse Training

Shiwei Liu; Tianlong Chen; Xiaohan Chen; Li Shen; Decebal Constantin Mocanu; Zhangyang Wang; Mykola Pechenizkiy

arXiv:2202.02643 · cs.LG · February 8, 2022 · 33 cites

TL;DR

This paper demonstrates that random pruning at initialization can effectively enable sparse training of neural networks, matching or surpassing dense models in performance, especially as network size increases, without complex pruning criteria.

Contribution

It reveals the surprisingly strong effectiveness of naive random pruning for sparse training, highlighting the importance of network size and layer-wise sparsity ratios.

Findings

01 Random pruning can match dense network performance in sparse training.

02 Larger and deeper networks benefit more from random pruning.

03 Randomly pruned networks outperform dense ones in robustness and uncertainty estimation.

Abstract

Random pruning is arguably the most naive way to attain sparsity in neural networks, but it has been deemed uncompetitive with both post-training pruning and sparse training. In this paper, we focus on sparse training and highlight a perhaps counter-intuitive finding: random pruning at initialization can be quite powerful for the sparse training of modern neural networks. Without any delicate pruning criteria or carefully pursued sparsity structures, we empirically demonstrate that sparsely training a randomly pruned network from scratch can match the performance of its dense equivalent. Two key factors contribute to this revival: (i) network size matters: as the original dense networks grow wider and deeper, the performance of a randomly pruned sparse network quickly grows to match that of its dense equivalent, even at high sparsity ratios; (ii)…
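
To make "random pruning at initialization" concrete, here is a minimal sketch in plain Python, not the authors' implementation. It assumes a single dense layer stored as a list of rows and a single uniform sparsity level; the helper names `random_mask` and `apply_mask` are hypothetical, and the sketch does not model the layer-wise sparsity ratios the paper highlights. The mask is drawn once, before training, without looking at weight magnitudes or gradients.

```python
import random


def random_mask(shape, sparsity, rng):
    """Build a 0/1 mask that zeroes round(sparsity * n) positions,
    chosen uniformly at random (no pruning criterion involved)."""
    rows, cols = shape
    n = rows * cols
    n_pruned = round(sparsity * n)
    pruned = set(rng.sample(range(n), n_pruned))
    return [
        [0 if r * cols + c in pruned else 1 for c in range(cols)]
        for r in range(rows)
    ]


def apply_mask(weights, mask):
    """Zero out pruned weights. In this sketch we assume the same mask
    would be reapplied after every update so pruned weights stay zero."""
    return [
        [w * m for w, m in zip(w_row, m_row)]
        for w_row, m_row in zip(weights, mask)
    ]


rng = random.Random(0)  # fixed seed so the mask is reproducible
shape = (4, 8)          # illustrative layer size (assumption)

# Randomly initialized dense weights for one layer.
weights = [[rng.gauss(0.0, 0.1) for _ in range(shape[1])]
           for _ in range(shape[0])]

# Prune 75% of the connections at initialization.
mask = random_mask(shape, sparsity=0.75, rng=rng)
sparse_weights = apply_mask(weights, mask)

kept = sum(sum(row) for row in mask)
print(f"kept {kept} of {shape[0] * shape[1]} weights")
```

In a real framework the mask would typically be a tensor of the same shape as the weight, multiplied in after each optimizer step; the plain-list version here only illustrates the selection step.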

Code & Models

Repositories

vita-group/random_pruning · PyTorch · Official

Videos

The Unreasonable Effectiveness of Random Pruning: Return of the Most Naive Baseline for Sparse Training · SlidesLive

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning

Methods: Pruning