Pruning Before Training May Improve Generalization, Provably
Hongru Yang, Yingbin Liang, Xiaojie Guo, Lingfei Wu, Zhangyang Wang

TL;DR
Pruning a neural network before training can improve generalization as long as the pruning fraction stays below a certain threshold. Excessive pruning can still lead to memorization of the training data, but without better generalization.
Contribution
This work makes a first attempt at a theoretical analysis of how the pruning fraction affects gradient descent training and generalization in neural networks, revealing both positive and negative effects.
Findings
Good generalization when pruning fraction is below threshold
Generalization bound improves with larger pruning fractions within limits
Excessive pruning leads to memorization without better generalization
Abstract
It has been observed in practice that applying pruning-at-initialization methods to neural networks and training the sparsified networks can not only retain the testing performance of the original dense models, but sometimes even slightly boost the generalization performance. A theoretical understanding of such experimental observations is yet to be developed. This work makes the first attempt to study how different pruning fractions affect the model's gradient descent dynamics and generalization. Specifically, this work considers a classification task for overparameterized two-layer neural networks, where the network is randomly pruned at different rates at initialization. It is shown that as long as the pruning fraction is below a certain threshold, gradient descent can drive the training loss toward zero and the network exhibits good generalization performance.…
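The setup described in the abstract (random pruning at initialization, followed by gradient descent on a two-layer network) can be illustrated with a minimal NumPy sketch. This is not the authors' code: the function names, the ReLU activation, the fixed random-sign output layer, the logistic loss, the toy data, and all hyperparameters are assumptions chosen only to make the idea concrete.

```python
import numpy as np

def random_prune_mask(shape, prune_fraction, rng):
    """Each weight is removed independently with probability prune_fraction."""
    return (rng.random(shape) >= prune_fraction).astype(np.float64)

def train_pruned_two_layer(X, y, width=64, prune_fraction=0.5,
                           lr=0.1, steps=200, seed=0):
    """Gradient descent on a randomly pruned two-layer ReLU network.

    Assumptions (not from the paper): logistic loss, labels in {-1, +1},
    output weights fixed to random signs scaled by 1/sqrt(width).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = rng.normal(size=(width, d))                    # hidden-layer weights
    a = rng.choice([-1.0, 1.0], size=width) / np.sqrt(width)
    mask = random_prune_mask(W.shape, prune_fraction, rng)
    W = W * mask                                       # prune before training
    losses = []
    for _ in range(steps):
        H = X @ W.T                                    # (n, width) pre-activations
        out = np.maximum(H, 0.0) @ a                   # (n,) network outputs
        margin = y * out
        losses.append(np.mean(np.logaddexp(0.0, -margin)))
        # d(loss)/d(out) = -y * sigmoid(-margin) / n, computed stably
        g_out = -y * np.exp(-np.logaddexp(0.0, margin)) / n
        # Backpropagate through the ReLU into the hidden weights
        G = (g_out[:, None] * (H > 0) * a[None, :]).T @ X
        W -= lr * (G * mask)                           # pruned weights stay zero
    return W, mask, losses

# Toy data (hypothetical): two noisy clusters with labels +1 and -1.
rng = np.random.default_rng(1)
y = rng.choice([-1.0, 1.0], size=100)
X = y[:, None] * np.ones(10) + 0.5 * rng.normal(size=(100, 10))

W, mask, losses = train_pruned_two_layer(X, y, prune_fraction=0.5)
print(f"kept fraction: {mask.mean():.2f}")
print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

Multiplying the gradient by the mask keeps the pruned weights at zero throughout training, so the sparsity pattern chosen at initialization never changes. Varying `prune_fraction` in this sketch is one way to experiment informally with the trade-off the paper analyzes, though it does not reproduce the paper's theoretical setting or results.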
Taxonomy
Topics: Neural Networks and Applications · Machine Learning and ELM · Advanced Neural Network Applications
Methods: Pruning
