Pruning Neural Networks at Initialization: Why are We Missing the Mark?

Jonathan Frankle, Gintare Karolina Dziugaite, Daniel M. Roy, Michael Carbin

arXiv:2009.08576 · cs.LG · March 23, 2021 · 48 cites

TL;DR

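This paper evaluates methods for pruning neural networks at initialization (SNIP, GraSP, SynFlow, and magnitude pruning). It shows that their per-weight pruning decisions can be replaced by a per-layer choice of the fraction of weights to prune, which suggests broader challenges with the pruning heuristics, with the goal of pruning at initialization, or both.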

Contribution

The study shows that the per-weight pruning decisions these methods make at initialization can be replaced by a per-layer fraction of weights to prune without reducing accuracy, and it argues that this property points to limitations of current heuristics for pruning at initialization.

Findings

01

Randomly shuffling the weights these methods prune within each layer, or sampling new initial values, preserves or improves accuracy.

02

Per-weight pruning decisions can be replaced by a per-layer choice of the fraction of weights to prune.

03

Current heuristics face fundamental challenges.
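
To make the idea of a "per-layer ratio" concrete, here is a minimal sketch in Python with NumPy. It is not the authors' code. It assumes that magnitude pruning at initialization ranks weights globally across layers, and the layer names, shapes, and scales are hypothetical. The point is only that a per-weight rule induces a fraction of kept weights in each layer, which is the quantity the findings refer to.

```python
import numpy as np

def global_magnitude_masks(weights, sparsity):
    """Prune the smallest-magnitude weights across all layers (assumed global ranking).

    Returns one boolean mask per layer: True = kept, False = pruned.
    """
    all_mags = np.concatenate([np.abs(w).ravel() for w in weights.values()])
    n_prune = int(round(sparsity * all_mags.size))
    if n_prune == 0:
        return {name: np.ones(w.shape, dtype=bool) for name, w in weights.items()}
    # Magnitude of the n_prune-th smallest weight; everything at or below it is pruned.
    threshold = np.partition(all_mags, n_prune - 1)[n_prune - 1]
    return {name: np.abs(w) > threshold for name, w in weights.items()}

def layer_densities(masks):
    """Fraction of weights kept in each layer."""
    return {name: float(m.mean()) for name, m in masks.items()}

# Hypothetical two-layer "network" whose layers are initialized at different scales.
rng = np.random.default_rng(0)
weights = {
    "small_scale": rng.normal(0.0, 0.05, size=(64, 64)),
    "large_scale": rng.normal(0.0, 0.5, size=(64, 64)),
}
masks = global_magnitude_masks(weights, sparsity=0.8)
densities = layer_densities(masks)
print(densities)
```

In this toy setup the layer with the larger initialization scale keeps a larger share of its weights, so the per-weight rule acts as a per-layer allocation of the sparsity budget.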

Abstract

Recent work has explored the possibility of pruning neural networks at initialization. We assess proposals for doing so: SNIP (Lee et al., 2019), GraSP (Wang et al., 2020), SynFlow (Tanaka et al., 2020), and magnitude pruning. Although these methods surpass the trivial baseline of random pruning, they remain below the accuracy of magnitude pruning after training, and we endeavor to understand why. We show that, unlike pruning after training, randomly shuffling the weights these methods prune within each layer or sampling new initial values preserves or improves accuracy. As such, the per-weight pruning decisions made by these methods can be replaced by a per-layer choice of the fraction of weights to prune. This property suggests broader challenges with the underlying pruning heuristics, the desire to prune at initialization, or both.
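
The two ablations described in the abstract can be sketched as follows. This is a hedged illustration in Python with NumPy, not the paper's implementation. The function names, the example layers, and the stand-in random masks are hypothetical, and the Gaussian resampling that matches each layer's empirical standard deviation is an assumption made for the sketch.

```python
import numpy as np

def shuffle_within_layers(masks, rng):
    """Randomly permute each layer's mask; the number of kept weights per layer is unchanged."""
    return {
        name: rng.permutation(m.ravel()).reshape(m.shape)
        for name, m in masks.items()
    }

def resample_init(weights, rng):
    """Sample new initial values per layer.

    Assumption: zero-mean Gaussian with the layer's empirical standard deviation.
    """
    return {
        name: rng.normal(0.0, w.std(), size=w.shape)
        for name, w in weights.items()
    }

rng = np.random.default_rng(0)

# Hypothetical layers and stand-in masks (True = kept); a real mask would
# come from SNIP, GraSP, SynFlow, or magnitude pruning.
weights = {
    "conv": rng.normal(0.0, 0.1, size=(16, 3, 3, 3)),
    "fc": rng.normal(0.0, 0.05, size=(10, 256)),
}
masks = {
    "conv": rng.random((16, 3, 3, 3)) < 0.5,
    "fc": rng.random((10, 256)) < 0.1,
}

# Ablation 1: keep the original initialization, shuffle which weights are pruned.
shuffled = shuffle_within_layers(masks, rng)
pruned_shuffled = {k: weights[k] * shuffled[k] for k in weights}

# Ablation 2: keep the original mask, sample new initial values.
resampled = resample_init(weights, rng)
pruned_resampled = {k: resampled[k] * masks[k] for k in weights}
```

Both ablations leave the fraction of kept weights in each layer untouched, which is why accuracy surviving them implies that only the per-layer fractions matter for these methods.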

Taxonomy

Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification

Methods: Pruning · SNIP