The Unreasonable Effectiveness of Random Pruning: Return of the Most Naive Baseline for Sparse Training
Shiwei Liu, Tianlong Chen, Xiaohan Chen, Li Shen, Decebal Constantin Mocanu, Zhangyang Wang, Mykola Pechenizkiy

TL;DR
This paper shows that random pruning at initialization, with no elaborate pruning criterion, is enough for effective sparse training: a randomly pruned network trained from scratch can match the performance of its dense equivalent, and it does so more readily as the network grows wider and deeper.
Contribution
It reveals the surprisingly strong effectiveness of naive random pruning for sparse training, highlighting the importance of network size and layer-wise sparsity ratios.
Findings
Random pruning can match dense network performance in sparse training.
Larger and deeper networks benefit more from random pruning.
Randomly pruned networks outperform dense ones in robustness and uncertainty estimation.
Abstract
Random pruning is arguably the most naive way to attain sparsity in neural networks, but has been deemed uncompetitive by either post-training pruning or sparse training. In this paper, we focus on sparse training and highlight a perhaps counter-intuitive finding, that random pruning at initialization can be quite powerful for the sparse training of modern neural networks. Without any delicate pruning criteria or carefully pursued sparsity structures, we empirically demonstrate that sparsely training a randomly pruned network from scratch can match the performance of its dense equivalent. There are two key factors that contribute to this revival: (i) the network sizes matter: as the original dense networks grow wider and deeper, the performance of training a randomly pruned sparse network will quickly grow to matching that of its dense equivalent, even at high sparsity ratios; (ii)…
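To make the idea of random pruning at initialization concrete, here is a minimal, dependency-free sketch. It is not the authors' code. It assumes the same sparsity ratio in every layer (the paper also studies other layer-wise ratios, which are not reproduced here), a He-style Gaussian initialization, and hypothetical layer names and sizes. The helper names `random_mask`, `prune_at_init`, and `masked_sgd_step` are illustrative only.

```python
import random


def random_mask(n_weights, sparsity, rng):
    """Binary mask with round(sparsity * n_weights) zeros at uniformly random positions."""
    n_pruned = round(sparsity * n_weights)
    pruned = set(rng.sample(range(n_weights), n_pruned))
    return [0.0 if i in pruned else 1.0 for i in range(n_weights)]


def prune_at_init(layer_sizes, sparsity, seed=0):
    """Initialize each layer, then zero a random subset of weights before any training.

    Assumption: one uniform sparsity ratio for all layers.
    """
    rng = random.Random(seed)
    weights, masks = {}, {}
    for name, (fan_in, fan_out) in layer_sizes.items():
        mask = random_mask(fan_in * fan_out, sparsity, rng)
        std = (2.0 / fan_in) ** 0.5  # He-style scale, an assumption of this sketch
        weights[name] = [rng.gauss(0.0, std) * m for m in mask]
        masks[name] = mask
    return weights, masks


def masked_sgd_step(w, grad, mask, lr=0.1):
    """One SGD step that re-applies the fixed mask, so pruned weights stay at zero."""
    return [(wi - lr * gi) * mi for wi, gi, mi in zip(w, grad, mask)]


# Hypothetical two-layer MLP, weights stored as flat lists, 90% sparsity.
layer_sizes = {"fc1": (64, 32), "fc2": (32, 10)}
weights, masks = prune_at_init(layer_sizes, sparsity=0.9)
```

The mask is drawn once and never changes during training, which is what separates this static baseline from methods that choose weights by a saliency score or that rewire connections during training.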
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning
Methods: Pruning
