How Weight Resampling and Optimizers Shape the Dynamics of Continual Learning and Forgetting in Neural Networks
Lapo Frati, Neil Traft, Jeff Clune, Nick Cheney

TL;DR
This paper investigates how weight resampling ('zapping') and optimizer choices influence learning and forgetting dynamics in neural networks during continual and transfer learning, revealing their roles in recovery speed and task interference.
Contribution
It provides a detailed analysis of how weight resampling and optimizers affect learning and forgetting patterns in neural networks under continual learning scenarios.
Findings
Models with zapping recover faster after domain transfer.
Optimizer choice significantly impacts learning and forgetting dynamics.
Complex task interference patterns emerge during sequential learning.
Abstract
Recent work in continual learning has highlighted the beneficial effect of resampling weights in the last layer of a neural network ("zapping"). Although empirical results demonstrate the effectiveness of this approach, the underlying mechanisms that drive these improvements remain unclear. In this work, we investigate in detail the patterns of learning and forgetting that take place inside a convolutional neural network when trained in challenging settings such as continual learning and few-shot transfer learning, with handwritten characters and natural images. Our experiments show that models that have undergone zapping during training recover more quickly from the shock of transferring to a new domain. Furthermore, to better observe the effect of continual learning in a multi-task setting, we measure how each individual task is affected. This shows that, not only zapping, but the…
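To make the idea of resampling last-layer weights concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the function name `zap_last_layer`, the choice of which rows to resample, the uniform fan-in initialization, and the zeroed biases are all assumptions made for illustration.

```python
import numpy as np

def zap_last_layer(weight, bias, class_ids=None, rng=None):
    """Return copies of (weight, bias) with selected output rows re-drawn at random.

    weight:    array of shape (num_classes, num_features)
    bias:      array of shape (num_classes,)
    class_ids: output rows to resample; None resamples the whole layer.
    """
    rng = np.random.default_rng() if rng is None else rng
    num_classes, num_features = weight.shape
    if class_ids is None:
        rows = np.arange(num_classes)
    else:
        rows = np.atleast_1d(np.asarray(class_ids))

    new_weight = weight.copy()
    new_bias = bias.copy()

    # Assumption: uniform initialization scaled by fan-in; the paper may use
    # a different initialization scheme.
    bound = 1.0 / np.sqrt(num_features)
    new_weight[rows] = rng.uniform(-bound, bound, size=(rows.size, num_features))
    # Assumption: biases of the resampled rows are reset to zero.
    new_bias[rows] = 0.0
    return new_weight, new_bias

# Hypothetical usage: a 5-class head on 8 features, resampling classes 1 and 3.
rng = np.random.default_rng(0)
W = np.ones((5, 8))
b = np.ones(5)
W2, b2 = zap_last_layer(W, b, class_ids=[1, 3], rng=rng)
```

In this sketch only the selected rows change, so the rest of the network (and the untouched output units) keep whatever they have learned, while the zapped units must be relearned from a fresh random start.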
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Memory Processes and Influences · Visual Attention and Saliency Detection
