On the Neural Tangent Kernel Analysis of Randomly Pruned Neural Networks
Hongru Yang, Zhangyang Wang

TL;DR
This paper analyzes how random pruning affects neural tangent kernels (NTKs) in neural networks, showing equivalence between pruned and original networks' NTKs under certain conditions, with implications for understanding pruning's impact on network training.
Contribution
It establishes the equivalence of NTKs between fully-connected neural networks and their randomly pruned versions in both infinite and finite-width regimes, introducing new analysis techniques.
Findings
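In the infinite-width limit, the NTK of a randomly pruned network converges to the limiting NTK of the original network up to an extra scaling factor, which disappears if the weights are rescaled appropriately after pruning.
In the finite-width regime, the width needed to keep the pruned network's NTK close to its limit depends asymptotically linearly on the sparsity parameter.
When the pruning probability is zero (no pruning), the required width matches known bounds for unpruned fully-connected networks up to logarithmic factors.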
Abstract
Motivated by both theory and practice, we study how random pruning of the weights affects a neural network's neural tangent kernel (NTK). In particular, this work establishes an equivalence of the NTKs between a fully-connected neural network and its randomly pruned version. The equivalence is established under two cases. The first main result studies the infinite-width asymptotic. It is shown that given a pruning probability, for fully-connected neural networks with the weights randomly pruned at the initialization, as the width of each layer grows to infinity sequentially, the NTK of the pruned neural network converges to the limiting NTK of the original network with some extra scaling. If the network weights are rescaled appropriately after pruning, this extra scaling can be removed. The second main result considers the finite-width case. It is shown that to ensure the NTK's…
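The "extra scaling" and its removal by rescaling can be seen numerically in a toy setting. The sketch below is not the paper's construction: it assumes a one-hidden-layer ReLU network, prunes only the output-layer weights, and computes only the part of the empirical NTK that comes from those weights. The names `alpha` (here the probability of keeping a weight, i.e. one minus the pruning probability), `last_layer_ntk`, and the rescaling factor `1/sqrt(alpha)` are illustrative assumptions.

```python
import numpy as np

def last_layer_ntk(x1, x2, W, mask, scale):
    """Empirical NTK restricted to the output-layer weights a_r of the toy model
    f(x) = (1/sqrt(m)) * sum_r scale * mask_r * a_r * relu(w_r . x).
    (Hypothetical helper, not taken from the paper.)"""
    m = W.shape[0]
    h1 = np.maximum(W @ x1, 0.0)  # hidden activations for x1
    h2 = np.maximum(W @ x2, 0.0)  # hidden activations for x2
    # df/da_r = scale * mask_r * relu(w_r . x) / sqrt(m); the kernel is the
    # inner product of these gradients. mask is 0/1, so mask**2 == mask.
    return (scale ** 2) * np.sum(mask * h1 * h2) / m

rng = np.random.default_rng(0)
d, m, alpha = 10, 100_000, 0.5  # input dim, width, keep probability (assumed)

# Two fixed unit-norm inputs with positive correlation.
x1 = np.zeros(d); x1[0] = 1.0
x2 = np.zeros(d); x2[0] = x2[1] = 1.0 / np.sqrt(2.0)

W = rng.standard_normal((m, d))               # hidden-layer weights
mask = (rng.random(m) < alpha).astype(float)  # 1 = weight kept, 0 = pruned

dense = last_layer_ntk(x1, x2, W, np.ones(m), 1.0)
pruned = last_layer_ntk(x1, x2, W, mask, 1.0)
rescaled = last_layer_ntk(x1, x2, W, mask, 1.0 / np.sqrt(alpha))

print("pruned / dense   =", pruned / dense)    # close to alpha at large width
print("rescaled / dense =", rescaled / dense)  # close to 1 at large width
```

In this toy model the unscaled pruned kernel is roughly `alpha` times the dense one, and multiplying the surviving weights by `1/sqrt(alpha)` brings the ratio back to roughly 1. The paper's results concern the full NTK of deep fully-connected networks, which this sketch does not attempt to reproduce.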
Taxonomy
Topics: Neural Networks and Applications · Stochastic Gradient Optimization Techniques · Machine Learning and ELM
Methods: Pruning · Neural Tangent Kernel
