The Propensity for Density in Feed-forward Models

Nandi Schoots; Alex Jackson; Ali Kholmovaia; Peter McBurney; Murray Shanahan

arXiv:2410.14461·cs.LG·October 21, 2024

TL;DR

This paper investigates whether training a neural network tends to use all of the available weights even when fewer would suffice. It finds that the proportion of weights that can be pruned without degrading performance is largely invariant to model width, suggesting a propensity for density.

Contribution

It provides empirical evidence, across fully connected, convolutional and residual models, that the density of pruned networks is largely invariant to model width, and it explores three hypotheses that could explain this phenomenon.

Findings

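01 Substantial prunability across a large range of model sizes (the widest model is 50 times as wide as the narrowest)

02 The proportion of weights that can be pruned is largely invariant to model width

03 Increasing model width has little effect on the density of the pruned model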

Abstract

Does the process of training a neural network to solve a task tend to use all of the available weights even when the task could be solved with fewer weights? To address this question we study the effects of pruning fully connected, convolutional and residual models while varying their widths. We find that the proportion of weights that can be pruned without degrading performance is largely invariant to model size. Increasing the width of a model has little effect on the density of the pruned model relative to the increase in absolute size of the pruned network. In particular, we find substantial prunability across a large range of model sizes, where our biggest model is 50 times as wide as our smallest model. We explore three hypotheses that could explain these findings.
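The distinction the abstract draws between the density of a pruned model (a proportion) and its absolute size (a weight count) can be made concrete with a minimal sketch. Everything in it is an assumption for illustration: the abstract does not say which pruning method the authors use, so plain magnitude pruning stands in; random matrices stand in for trained weights; the prune fraction is fixed by hand rather than found by checking task performance; and the names `magnitude_prune_mask` and `pruned_size_and_density` are hypothetical.

```python
import numpy as np


def magnitude_prune_mask(weights: np.ndarray, prune_fraction: float) -> np.ndarray:
    """Boolean mask that drops the smallest-magnitude weights.

    Assumption: magnitude pruning is used here for illustration only;
    it is not claimed to be the paper's pruning procedure.
    """
    flat = np.abs(weights).ravel()
    n_prune = int(prune_fraction * flat.size)
    mask = np.ones(flat.size, dtype=bool)
    if n_prune > 0:
        # Indices of the n_prune entries with the smallest magnitude.
        smallest = np.argpartition(flat, n_prune - 1)[:n_prune]
        mask[smallest] = False
    return mask.reshape(weights.shape)


def pruned_size_and_density(width: int, prune_fraction: float, seed: int = 0):
    """Return (weights kept, density) for one width x width layer.

    Random weights are a stand-in for a trained layer (assumption).
    """
    rng = np.random.default_rng(seed)
    weights = rng.normal(size=(width, width))
    mask = magnitude_prune_mask(weights, prune_fraction)
    kept = int(mask.sum())
    return kept, kept / mask.size


# Two widths a factor of 50 apart, pruned by the same fraction.
for width in (8, 400):
    kept, dens = pruned_size_and_density(width, prune_fraction=0.5)
    print(f"width={width:4d}  kept={kept:6d}  density={dens:.2f}")
```

With the prune fraction held fixed, the density is the same at both widths while the number of surviving weights grows with the square of the width. The paper's empirical claim is that the fraction that can be pruned without degrading performance behaves roughly like this fixed fraction as width varies; the sketch does not test that claim, it only shows what the two quantities measure.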

Taxonomy

Methods: Pruning