Pruning at Initialization -- A Sketching Perspective
Noga Bar, Raja Giryes

TL;DR
This paper connects pruning at initialization in neural networks to sketching in matrix approximation, providing theoretical insights and improvements for data-independent pruning methods.
Contribution
It introduces a sketching perspective to analyze and improve pruning at initialization, with theoretical bounds and algorithmic enhancements.
Findings
Bounded the approximation error of pruned models using sketching techniques
Provided theoretical justification for data-independent pruning strategies
Suggested a generic improvement to existing pruning algorithms
Abstract
The lottery ticket hypothesis (LTH) has increased attention to pruning neural networks at initialization. We study this problem in the linear setting. We show that finding a sparse mask at initialization is equivalent to the sketching problem introduced for efficient matrix multiplication. This gives us tools to analyze the LTH problem and gain insights into it. Specifically, using the mask found at initialization, we bound the approximation error of the pruned linear model at the end of training. We theoretically justify previous empirical evidence that the search for sparse networks may be data independent. By using the sketching perspective, we suggest a generic improvement to existing algorithms for pruning at initialization, which we show to be beneficial in the data-independent case.
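To make the link between masks and sketching concrete, here is a minimal sketch in the spirit of classical sampling-based approximate matrix multiplication, applied to a linear model y = Xw. It is an illustration under stated assumptions, not the paper's algorithm: the function name `sketch_mask`, the sampling probabilities, and the toy data are all hypothetical. Coordinates of w are sampled with replacement and rescaled so that the sparse vector gives an unbiased estimate of Xw. Passing no column norms makes the choice depend on the weights alone, which mirrors the data-independent case.

```python
import numpy as np

def sketch_mask(w, s, col_norms=None, rng=None):
    """Sample s coordinates of w with replacement and return a rescaled sparse copy.

    Hypothetical helper for illustration. Sampling probabilities are
    proportional to |w_k| * col_norms[k]; with col_norms=None they depend
    only on the weights (a data-independent choice).
    """
    rng = np.random.default_rng() if rng is None else rng
    scores = np.abs(w) if col_norms is None else np.abs(w) * col_norms
    p = scores / scores.sum()
    idx = rng.choice(w.size, size=s, replace=True, p=p)
    counts = np.bincount(idx, minlength=w.size)
    # Coordinates with p == 0 are never sampled; guard the division for them.
    safe_p = np.where(p > 0, p, 1.0)
    # Rescaling by counts / (s * p) makes X @ w_sparse unbiased for X @ w.
    return w * counts / (s * safe_p)

# Toy comparison on random data (assumed setup, not from the paper).
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))
w = rng.standard_normal(50)

w_free = sketch_mask(w, s=20, rng=rng)
w_aware = sketch_mask(w, s=20, col_norms=np.linalg.norm(X, axis=0), rng=rng)

for name, w_sparse in [("data-independent", w_free), ("data-aware", w_aware)]:
    rel_err = np.linalg.norm(X @ w - X @ w_sparse) / np.linalg.norm(X @ w)
    print(name, "nonzeros:", np.count_nonzero(w_sparse), "relative error:", rel_err)
```

The nonzero pattern of the returned vector plays the role of the pruning mask, and at most `s` entries survive. On i.i.d. Gaussian data like this the column norms are nearly equal, so the two sampling schemes behave similarly; that is a property of the toy setup and not a claim about the paper's results.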
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Neural Networks and Applications · Model Reduction and Neural Networks
Methods: Pruning
