The Generalization-Stability Tradeoff In Neural Network Pruning
Brian R. Bartoldson, Ari S. Morcos, Adrian Barbu, Gordon Erlebacher

TL;DR
This paper investigates how neural network pruning affects generalization, revealing a tradeoff where increased instability from pruning can improve test accuracy by acting as a regularizer, similar to noise injection.
Contribution
The paper introduces the generalization-stability tradeoff in pruning: less stable pruning tends to generalize better, an effect the authors explain through a regularization mechanism akin to noise injection.
Findings
Pruning's benefit to generalization increases with its instability.
Less stable pruning leads to flatter models and better generalization.
Pruning benefits are independent of permanent parameter removal.
Abstract
Pruning neural network parameters is often viewed as a means to compress models, but pruning has also been motivated by the desire to prevent overfitting. This motivation is particularly relevant given the perhaps surprising observation that a wide variety of pruning approaches increase test accuracy despite sometimes massive reductions in parameter counts. To better understand this phenomenon, we analyze the behavior of pruning over the course of training, finding that pruning's benefit to generalization increases with pruning's instability (defined as the drop in test accuracy immediately following pruning). We demonstrate that this "generalization-stability tradeoff" is present across a wide variety of pruning settings and propose a mechanism for its cause: pruning regularizes similarly to noise injection. Supporting this, we find less pruning stability leads to more model flatness…
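The abstract defines instability as the drop in test accuracy immediately following pruning. As a rough illustration only, the sketch below measures that drop for a toy linear classifier after magnitude pruning. The function names (`prune_smallest`, `accuracy`, `instability`), the toy data, and the choice of magnitude pruning are assumptions made for this example, not the authors' code or experimental setup, and the paper may normalize the drop differently.

```python
def prune_smallest(weights, fraction):
    """Zero out the given fraction of weights with the smallest magnitude."""
    k = int(len(weights) * fraction)
    # Indices sorted by absolute weight value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def accuracy(weights, inputs, labels):
    """Accuracy of a linear classifier sign(w . x) with labels in {-1, +1}."""
    correct = 0
    for x, y in zip(inputs, labels):
        score = sum(w * xi for w, xi in zip(weights, x))
        pred = 1 if score >= 0 else -1
        correct += int(pred == y)
    return correct / len(labels)


def instability(acc_before, acc_after):
    """Drop in test accuracy measured immediately after a pruning step."""
    return acc_before - acc_after


# Hypothetical toy "test set": each input activates exactly one weight.
weights = [2.0, -0.1, 0.5, -3.0]
inputs = [[1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0]]
labels = [1, -1, 1, -1]

acc_before = accuracy(weights, inputs, labels)
pruned = prune_smallest(weights, 0.5)
acc_after = accuracy(pruned, inputs, labels)
print(instability(acc_before, acc_after))
```

In a real training run the two accuracies would be measured on held-out data just before and just after each pruning event, and training would then continue so the network can recover.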
Taxonomy
Topics: Neural Networks and Applications · Model Reduction and Neural Networks · Reinforcement Learning in Robotics
Methods: Pruning
