Keep Moving: identifying task-relevant subspaces to maximise plasticity for newly learned tasks
Daniel Anthes, Sushrut Thorat, Peter König, Tim C. Kietzmann

TL;DR
This paper introduces a method to decompose neural network activation spaces into task-relevant and task-irrelevant subspaces, revealing how changes in each subspace affect stability and plasticity in continual learning.
Contribution
It proposes a novel subspace decomposition technique that distinguishes activation changes that affect prior tasks from those that do not, improving understanding of the stability-plasticity trade-off.
Findings
Not all activation changes cause forgetting.
Regularisation techniques do not fully separate task-relevant and task-irrelevant subspaces.
Manipulating subspaces in a linear model shows causal links to stability and plasticity.
Abstract
Continual learning algorithms strive to acquire new knowledge while preserving prior information. Often, these algorithms emphasise stability and restrict network updates upon learning new tasks. In many cases, such restrictions come at a cost to the model's plasticity, i.e. the model's ability to adapt to the requirements of a new task. But is all change detrimental? Here, we approach this question by proposing that activation spaces in neural networks can be decomposed into two subspaces: a readout range in which change affects prior tasks and a null space in which change does not alter prior performance. Based on experiments with this novel technique, we show that, indeed, not all activation change is associated with forgetting. Instead, only change in the subspace visible to the readout of a task can lead to decreased stability, while restricting change outside of this subspace is…
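The decomposition described in the abstract can be illustrated for the simplest case of a linear readout. The following is a minimal sketch, not the authors' implementation: it assumes the "readout range" corresponds to the row space of a readout weight matrix `W` and the "null space" to its orthogonal complement. The function name `readout_projectors`, the matrix sizes, and the random data are all hypothetical choices made for illustration.

```python
import numpy as np

def readout_projectors(W, rtol=1e-10):
    """Projectors onto the row space and null space of a linear readout.

    W has shape (n_outputs, n_hidden) and maps activations h to y = W @ h.
    Assumption: the row space of W plays the role of the "readout range".
    """
    # Right singular vectors with non-zero singular values span the row space.
    _, s, Vt = np.linalg.svd(W, full_matrices=True)
    rank = int(np.sum(s > rtol * s.max()))
    V_range = Vt[:rank].T                      # (n_hidden, rank)
    P_range = V_range @ V_range.T              # projector onto the row space
    P_null = np.eye(W.shape[1]) - P_range      # projector onto the null space
    return P_range, P_null

# Hypothetical example: 3 readout units reading from 10 hidden units.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 10))
h_old = rng.normal(size=10)                    # activations after the old task
h_new = h_old + rng.normal(size=10)            # activations after further training

P_range, P_null = readout_projectors(W)
delta = h_new - h_old
delta_range = P_range @ delta                  # change visible to the readout
delta_null = P_null @ delta                    # change invisible to the readout

# Effect of each component of the change on the old task's output.
out_shift_null = W @ (h_old + delta_null) - W @ h_old
out_shift_range = W @ (h_old + delta_range) - W @ h_old
out_shift_full = W @ h_new - W @ h_old
```

In this sketch, the null-space component of the change leaves the old readout's output untouched, while the range component accounts for the entire output shift. That mirrors the abstract's claim that only change visible to the readout can reduce stability, at least for a linear readout.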
Taxonomy
Topics: Intelligent Tutoring Systems and Adaptive Learning · Cognitive Science and Mapping · Higher Education Learning Practices
Methods: Elastic Weight Consolidation
