A study on the plasticity of neural networks

Tudor Berariu; Wojciech Czarnecki; Soham De; Jorg Bornschein; Samuel Smith; Razvan Pascanu; Claudia Clopath

arXiv:2106.00042 · cs.LG · October 17, 2023

TL;DR

This paper investigates how prior training affects a neural network's plasticity. It builds on the observation that a model pretrained on data from the same distribution it is later fine-tuned on may generalize worse than a freshly initialized one, and it discusses what this means for continual learning.

Contribution

The paper proposes a hypothesis for why pretrained models may lose plasticity and generalization ability, and highlights the implications for continual learning.

Findings

01

A model pretrained on data from the same distribution as its fine-tuning data may not reach the same generalization as a freshly initialized one.

02

When a network loses plasticity, the performance it can reach on a new task is negatively affected by previously seen tasks.

03

The paper proposes a hypothesis for the mechanics behind this loss of plasticity.

Abstract

One aim shared by multiple settings, such as continual learning or transfer learning, is to leverage previously acquired knowledge to converge faster on the current task. Usually this is done through fine-tuning, where an implicit assumption is that the network maintains its plasticity, meaning that the performance it can reach on any given task is not affected negatively by previously seen tasks. It has been observed recently that a model pretrained on data from the same distribution as the one it is fine-tuned on might not reach the same generalisation as a freshly initialised one. We build on and extend this observation, providing a hypothesis for the mechanics behind it. We discuss the implications of losing plasticity for continual learning, which relies heavily on optimising pretrained models.
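
The comparison described in the abstract (a pretrained, then fine-tuned model versus a freshly initialised one) can be sketched as two training schedules evaluated on the same held-out data. The code below is a minimal illustration and not the authors' code. The toy two-blob dataset, the logistic-regression model, the choice to pretrain on the first half of the training set, and all function names and hyperparameters are assumptions made for this example. A convex model like this one is not expected to reproduce the generalisation gap the paper studies in neural networks; the sketch only shows how the two schedules are set up and compared.

```python
import math
import random

def make_data(n, rng):
    """Sample n points from two 2-D Gaussian blobs (toy stand-in for a real dataset)."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        centre = 1.0 if label == 1 else -1.0
        data.append(([rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)], label))
    return data

def init_params(rng):
    """Fresh random initialisation: two weights and a bias."""
    return [rng.gauss(0.0, 0.1) for _ in range(3)]

def predict(params, x):
    """Probability of class 1 under a logistic model."""
    z = params[0] * x[0] + params[1] * x[1] + params[2]
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-z))

def train(params, data, epochs, lr, rng):
    """Plain SGD on the logistic loss; returns a new parameter list."""
    params = list(params)
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for i in order:
            x, y = data[i]
            err = predict(params, x) - y
            params[0] -= lr * err * x[0]
            params[1] -= lr * err * x[1]
            params[2] -= lr * err
    return params

def accuracy(params, data):
    """Fraction of examples classified correctly at a 0.5 threshold."""
    correct = sum((predict(params, x) >= 0.5) == (y == 1) for x, y in data)
    return correct / len(data)

def run_comparison(seed=0):
    rng = random.Random(seed)
    train_set, test_set = make_data(200, rng), make_data(200, rng)
    # Warm start: pretrain on the first half, then fine-tune on the full set.
    warm = train(init_params(rng), train_set[:100], epochs=20, lr=0.1, rng=rng)
    warm = train(warm, train_set, epochs=20, lr=0.1, rng=rng)
    # Baseline: train a freshly initialised model on the full set only.
    fresh = train(init_params(rng), train_set, epochs=20, lr=0.1, rng=rng)
    return accuracy(warm, test_set), accuracy(fresh, test_set)

warm_acc, fresh_acc = run_comparison()
print(f"warm start: {warm_acc:.3f}  fresh init: {fresh_acc:.3f}")
```

To probe the effect the abstract describes, one would replace the logistic model with a neural network and a real dataset, keeping the same two schedules and the same held-out evaluation.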

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning · Model Reduction and Neural Networks