Maslow's Hammer for Catastrophic Forgetting: Node Re-Use vs Node Activation

Sebastian Lee; Stefano Sarao Mannelli; Claudia Clopath; Sebastian Goldt; Andrew Saxe

arXiv:2205.09029 · stat.ML · August 1, 2022 · 5 cites

TL;DR

This paper investigates catastrophic forgetting in neural networks, revealing a trade-off between node activation and re-use that explains why forgetting peaks at intermediate task similarities, and reinterprets existing methods accordingly.

Contribution

It introduces the Maslow's hammer hypothesis, which explains the non-monotonic forgetting pattern through a trade-off between node activation and node re-use, and analyzes the effectiveness of existing interventions in terms of that trade-off.

Findings

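01 Forgetting is worst at intermediate task similarity, not at the highest dissimilarity.

02 A trade-off between node activation and node re-use explains this forgetting behavior.

03 Popular algorithmic interventions for catastrophic interference are reinterpreted in terms of this trade-off, identifying the regimes in which each is most effective.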

Abstract

Continual learning - learning new tasks in sequence while maintaining performance on old tasks - remains particularly challenging for artificial neural networks. Surprisingly, the amount of forgetting does not increase with the dissimilarity between the learned tasks, but appears to be worst in an intermediate similarity regime. In this paper we theoretically analyse both a synthetic teacher-student framework and a real data setup to provide an explanation of this phenomenon that we name Maslow's hammer hypothesis. Our analysis reveals the presence of a trade-off between node activation and node re-use that results in worst forgetting in the intermediate regime. Using this understanding we reinterpret popular algorithmic interventions for catastrophic interference in terms of this trade-off, and identify the regimes in which they are most effective.
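
As a rough illustration of the kind of teacher-student experiment the abstract refers to, the sketch below trains one small two-layer student on two teacher tasks in sequence and reports how much the first task's error grows. It is a toy under stated assumptions, not the authors' code: the single-unit teachers, the cosine-overlap definition of task similarity, the scaled erf activation, the separate read-out head per task, and every function name and hyperparameter here are hypothetical choices made for illustration.

```python
import math
import random


def g(z):
    """Scaled error-function activation (an assumed choice for this sketch)."""
    return math.erf(z / math.sqrt(2))


def g_prime(z):
    """Derivative of g."""
    return math.sqrt(2 / math.pi) * math.exp(-z * z / 2)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def make_teachers(dim, similarity, rng):
    """Return two unit-norm teacher vectors with cosine overlap `similarity`."""
    a = [rng.gauss(0, 1) for _ in range(dim)]
    norm_a = math.sqrt(dot(a, a))
    a = [v / norm_a for v in a]
    b = [rng.gauss(0, 1) for _ in range(dim)]
    proj = dot(b, a)
    b = [bv - proj * av for bv, av in zip(b, a)]  # remove the component along a
    norm_b = math.sqrt(dot(b, b))
    b = [v / norm_b for v in b]
    mix = math.sqrt(1 - similarity ** 2)
    second = [similarity * av + mix * bv for av, bv in zip(a, b)]
    return a, second


def sgd_step(W, head, teacher, x, lr):
    """One online SGD step on squared error; W and head are updated in place."""
    scale = 1 / math.sqrt(len(x))
    pre = [dot(w, x) * scale for w in W]
    hidden = [g(p) for p in pre]
    err = dot(head, hidden) - g(dot(teacher, x))
    for k, w in enumerate(W):
        # Compute the hidden-weight coefficient with the old head value first.
        coeff = lr * err * head[k] * g_prime(pre[k]) * scale
        for i in range(len(x)):
            w[i] -= coeff * x[i]
        head[k] -= lr * err * hidden[k]


def task_error(W, head, teacher, inputs):
    """Mean halved squared error of the student on one teacher's task."""
    scale = 1 / math.sqrt(len(teacher))
    total = 0.0
    for x in inputs:
        pred = sum(h * g(dot(w, x) * scale) for h, w in zip(head, W))
        total += 0.5 * (pred - g(dot(teacher, x))) ** 2
    return total / len(inputs)


def forgetting(similarity, dim=20, hidden=3, steps=1000, lr=0.1, seed=0):
    """Increase in task-1 error caused by subsequently training on task 2."""
    rng = random.Random(seed)
    t1, t2 = make_teachers(dim, similarity, rng)
    W = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(hidden)]
    head1 = [rng.gauss(0, 0.1) for _ in range(hidden)]  # read-out for task 1
    head2 = [rng.gauss(0, 0.1) for _ in range(hidden)]  # read-out for task 2
    inputs = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(200)]
    for _ in range(steps):
        sgd_step(W, head1, t1, [rng.gauss(0, 1) for _ in range(dim)], lr)
    before = task_error(W, head1, t1, inputs)
    for _ in range(steps):
        sgd_step(W, head2, t2, [rng.gauss(0, 1) for _ in range(dim)], lr)
    after = task_error(W, head1, t1, inputs)
    return after - before


if __name__ == "__main__":
    for s in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"similarity={s:.2f}  forgetting={forgetting(s):+.4f}")
```

Sweeping `similarity` from 0 to 1 and averaging over several seeds is one way to probe how forgetting depends on task similarity in this toy. Whether such a small configuration reproduces the intermediate-regime peak described in the abstract is not guaranteed.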

Code & Models

Repositories

seblee97/student_teacher_catastrophic
PyTorch · Official

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and Algorithms · Explainable Artificial Intelligence (XAI)