Why is Pruning at Initialization Immune to Reinitializing and Shuffling?
Sahib Singh, Rosanne Liu

TL;DR
This paper investigates why pruning-at-initialization methods remain effective even after the initial weights are resampled or the mask positions are shuffled within each layer. It finds that the distribution of unpruned weights changes minimally under these randomizations.
Contribution
The study provides an analysis of layer-wise weight distributions, explaining the robustness of pruning-at-initialization methods against reinitialization and shuffling.
Findings
Unpruned weight distributions change minimally after randomization.
Pruning-at-initialization methods (SNIP, GraSP, SynFlow, and magnitude pruning) are robust to weight reinitialization and layerwise mask shuffling.
Layer-wise statistics explain the immunity of pruning at initialization.
Abstract
Recent studies assessing the efficacy of neural network pruning methods uncovered a surprising finding: in ablation studies on existing pruning-at-initialization methods, namely SNIP, GraSP, SynFlow, and magnitude pruning, the performance of these methods remains unchanged, and sometimes even improves, when the mask positions are randomly shuffled within each layer (Layerwise Shuffling) or new initial weight values are sampled (Reinit) while the pruning masks are kept the same. We attempt to understand the reason behind such network immunity towards weight/mask modifications by studying layer-wise statistics before and after randomization operations. We found that under each of the pruning-at-initialization methods, the distribution of unpruned weights changed minimally with randomization operations.
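The two randomization operations named in the abstract can be illustrated on a single layer. The following is a minimal NumPy sketch, not the authors' code: the function names, the Gaussian initialization with standard deviation 0.05, the layer shape, and the 90% sparsity level are all assumptions chosen for illustration, and only magnitude pruning is shown because it needs no data or gradients. It computes summary statistics of the unpruned weights before and after each operation and makes no claim about reproducing the paper's measurements.

```python
import numpy as np

def magnitude_mask(weights, sparsity):
    """Boolean mask keeping the largest-magnitude entries (True = kept)."""
    k = int(round(weights.size * (1.0 - sparsity)))  # number of weights to keep
    flat = np.abs(weights).ravel()
    keep_idx = np.argsort(flat)[flat.size - k:]      # indices of the k largest
    mask = np.zeros(weights.size, dtype=bool)
    mask[keep_idx] = True
    return mask.reshape(weights.shape)

def layerwise_shuffle(mask, rng):
    """Randomly permute mask positions within one layer; sparsity is preserved."""
    return rng.permutation(mask.ravel()).reshape(mask.shape)

def reinit(shape, std, rng):
    """Sample fresh initial weights (assumed zero-mean Gaussian init)."""
    return rng.normal(0.0, std, size=shape)

def unpruned_stats(weights, mask):
    """Summary statistics of the weights that survive pruning."""
    kept = weights[mask]
    return {"count": int(kept.size),
            "mean": float(kept.mean()),
            "std": float(kept.std())}

rng = np.random.default_rng(0)
std = 0.05                                  # hypothetical init scale
w = rng.normal(0.0, std, size=(256, 128))   # hypothetical layer weights

mask = magnitude_mask(w, sparsity=0.9)      # prune 90% of the layer
shuffled = layerwise_shuffle(mask, rng)     # Layerwise Shuffling: same weights, moved mask
w_new = reinit(w.shape, std, rng)           # Reinit: new weights, same mask

stats_original = unpruned_stats(w, mask)
stats_shuffled = unpruned_stats(w, shuffled)
stats_reinit = unpruned_stats(w_new, mask)

for name, s in [("original", stats_original),
                ("shuffled", stats_shuffled),
                ("reinit", stats_reinit)]:
    print(name, s)
```

Both operations leave the per-layer sparsity untouched, so any change in these statistics comes from which weights are kept or which values they hold. Applying the same comparison to each layer of a network gives the kind of layer-wise view the abstract describes. A fuller comparison would look at histograms rather than just the mean and standard deviation.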
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Neural Network Applications · Anomaly Detection Techniques and Applications
Methods: Pruning · SNIP
