The Propensity for Density in Feed-forward Models
Nandi Schoots, Alex Jackson, Ali Kholmovaia, Peter McBurney, Murray Shanahan

TL;DR
This paper investigates whether training a neural network tends to use all of the available weights, and finds that the proportion of prunable weights stays roughly constant across model sizes, which points to a propensity for density.
Contribution
It provides empirical evidence that the density of neural networks is largely invariant to model size and explores hypotheses explaining this phenomenon.
Findings
High prunability across various model sizes
Pruned proportion is largely invariant to model width
Model size increase does not significantly affect density
Abstract
Does the process of training a neural network to solve a task tend to use all of the available weights even when the task could be solved with fewer weights? To address this question we study the effects of pruning fully connected, convolutional and residual models while varying their widths. We find that the proportion of weights that can be pruned without degrading performance is largely invariant to model size. Increasing the width of a model has little effect on the density of the pruned model relative to the increase in absolute size of the pruned network. In particular, we find substantial prunability across a large range of model sizes, where our biggest model is 50 times as wide as our smallest model. We explore three hypotheses that could explain these findings.
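To make the terms "pruned proportion" and "density" concrete, here is a minimal sketch in plain Python. It is not the authors' procedure: the abstract does not say which pruning method is used, so magnitude pruning is assumed here, and the helper names (`magnitude_prune`, `density`, `layer_weights`), the fixed 75% pruning fraction, and the widths are all hypothetical. The real experiments search for the largest fraction that can be removed without degrading performance; this sketch only shows how a fixed pruned proportion leaves the same density while the absolute number of surviving weights grows with width.

```python
import random


def magnitude_prune(weights, fraction):
    """Zero out the `fraction` of weights with the smallest absolute value."""
    n_prune = int(len(weights) * fraction)
    # Indices ordered from smallest to largest magnitude.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def density(weights):
    """Fraction of weights that are non-zero."""
    return sum(1 for w in weights if w != 0.0) / len(weights)


def layer_weights(width, n_inputs=4, seed=0):
    """Random weights for a toy layer with `width` units (illustrative only)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(width * n_inputs)]


# Hypothetical widths; the widest is 50 times the narrowest.
for width in (8, 80, 400):
    w = layer_weights(width)
    p = magnitude_prune(w, 0.75)  # assumed pruning fraction, not from the paper
    remaining = sum(1 for x in p if x != 0.0)
    print(f"width={width} total={len(w)} remaining={remaining} density={density(p):.2f}")
```

With the pruned proportion held fixed, density is identical at every width, while the count of remaining weights scales with the size of the model. That growth in absolute size at constant density is the pattern the abstract describes.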
Taxonomy
Methods: Pruning
