On the Stability of Growth in Structural Plasticity
Lute Lillo, Nick Cheney

TL;DR
This paper studies structural growth in neural networks during training, shows that growth is not simply the inverse of pruning, and argues that stable integration of newly inserted units is central to growth methods.
Contribution
The paper isolates the insertion problem in growth, analyzes the weak gradient signal that newly inserted units receive, and evaluates how growth compares to pruning in various settings.
Findings
Newborn units often receive weaker gradient signals than existing units.
Growth can achieve high accuracy during editing, but pruning often yields better final performance.
Improving integration of new units can enhance adaptive performance, especially in continual learning.
Abstract
Standard deep-learning pipelines usually choose the network architecture before training and keep it fixed throughout optimization. In contrast, a model can also be adapted by editing its structure during training, for example by pruning existing hidden-neuron units or growing new ones. Although growth is appealing for adaptive and continual systems, we show that it is not simply the inverse of pruning. Pruning selects among units that have participated in training from the start, whereas growth inserts new units into an already specialized optimization trajectory. We isolate this insertion problem and show that newborn units are often forward-active but backward-starved: they participate in the forward computation, yet receive much weaker gradient signal than incumbent units. This disadvantage is minor in small MLP benchmarks, but becomes clear in harder image-classification settings…
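The "forward-active but backward-starved" condition described in the abstract can be illustrated with a small diagnostic. The following is a minimal sketch under stated assumptions, not the authors' method: it assumes a one-hidden-layer ReLU MLP with a squared-error loss, and it assumes newborn units are inserted with ordinary incoming weights but near-zero outgoing weights (a common function-preserving choice that the abstract does not confirm). The incumbent weights are random rather than trained, and all names (`per_unit_grad_norms`, `new_out_scale`, and so on) are hypothetical.

```python
import numpy as np


def per_unit_grad_norms(W1, W2, X, Y):
    """Gradient norm of each hidden unit's incoming weights, plus activations.

    Loss (assumed for illustration): 0.5 * mean over samples of ||relu(X W1) W2 - Y||^2.
    """
    pre = X @ W1                      # pre-activations, shape (n, hidden)
    H = np.maximum(pre, 0.0)          # ReLU activations
    out = H @ W2                      # network outputs, shape (n, outputs)
    d_out = (out - Y) / X.shape[0]    # dLoss/d_out
    d_H = d_out @ W2.T                # backprop through the output layer
    d_pre = d_H * (pre > 0)           # backprop through ReLU
    g_W1 = X.T @ d_pre                # dLoss/dW1, shape (inputs, hidden)
    return np.linalg.norm(g_W1, axis=0), H


rng = np.random.default_rng(0)
n, d_in, d_out_dim = 256, 20, 5
n_old, n_new = 32, 8
new_out_scale = 1e-3  # ASSUMPTION: newborn outgoing weights start near zero

X = rng.normal(size=(n, d_in))
Y = rng.normal(size=(n, d_out_dim))

# Incumbent units (random stand-ins for trained weights in this sketch).
W1_old = rng.normal(size=(d_in, n_old)) / np.sqrt(d_in)
W2_old = rng.normal(size=(n_old, d_out_dim)) / np.sqrt(n_old)

# Newborn units: normal incoming weights, tiny outgoing weights.
W1_new = rng.normal(size=(d_in, n_new)) / np.sqrt(d_in)
W2_new = new_out_scale * rng.normal(size=(n_new, d_out_dim))

# Grow the hidden layer by concatenating the new units.
W1 = np.concatenate([W1_old, W1_new], axis=1)
W2 = np.concatenate([W2_old, W2_new], axis=0)

norms, H = per_unit_grad_norms(W1, W2, X, Y)
incumbent_norm = norms[:n_old].mean()
newborn_norm = norms[n_old:].mean()
newborn_activity = H[:, n_old:].mean()  # > 0 means the units fire in the forward pass

print(f"mean grad norm, incumbents: {incumbent_norm:.3e}")
print(f"mean grad norm, newborns:   {newborn_norm:.3e}")
print(f"mean newborn activation:    {newborn_activity:.3e}")
```

In this toy setup the gradient reaching a unit's incoming weights is scaled by that unit's outgoing weights, so a newborn unit can have nonzero activations while receiving a far smaller gradient than incumbents. Whether this is the mechanism the paper identifies is not stated in the truncated abstract.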
