Disentangling and Mitigating the Impact of Task Similarity for Continual Learning
Naoki Hiratani

TL;DR
This paper analyzes how task similarity affects continual learning in neural networks, identifying the conditions that cause interference and evaluating strategies such as weight regularization for improving knowledge retention.
Contribution
The paper introduces an analytical model, a linear teacher-student model with latent structure, to separate the effects of input feature similarity and readout similarity, and evaluates gating and regularization methods for mitigating forgetting.
Findings
High input feature similarity with low readout similarity causes catastrophic forgetting.
Task-dependent activity gating improves retention but reduces transfer.
Fisher information-based regularization enhances retention across task similarities.
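The last finding can be illustrated with a small numerical sketch. The page does not give the exact form of the regularizer, so everything below is an assumption made for illustration: a linear student `W`, two tasks defined by hypothetical mixing matrices `A1`, `A2` (latent to input) and readouts `B1`, `B2` (latent to target), and unit-variance Gaussian output noise, under which the Fisher information of task 1 with respect to `W` reduces to the input covariance `A1 @ A1.T`. The penalty is then `lam * ||(W - W1) A1||^2`. The function names and parameter values are not taken from the paper.

```python
import numpy as np

def correlated_copy(M, rho, scale, rng):
    # Matrix whose entries have correlation rho with the entries of M.
    noise = rng.normal(0.0, scale, size=M.shape)
    return rho * M + np.sqrt(1.0 - rho**2) * noise

def rel_loss(W, A, B):
    # Population squared error for latent s ~ N(0, I), relative to ||B||^2.
    return float(np.sum((B - W @ A) ** 2) / np.sum(B ** 2))

def learn_task2(W1, A1, A2, B2, lam, lr=0.1, steps=2000):
    # Gradient descent on ||B2 - W A2||^2 + lam * ||(W - W1) A1||^2.
    fisher = A1 @ A1.T  # task-1 input covariance (assumed Fisher metric)
    W = W1.copy()
    for _ in range(steps):
        grad_fit = -2.0 * (B2 - W @ A2) @ A2.T
        grad_pen = 2.0 * lam * (W - W1) @ fisher
        W = W - lr * (grad_fit + grad_pen)
    return W

def run(rho_a, rho_b, lam, n_latent=20, n_in=200, n_out=10, seed=0):
    # rho_a: input feature similarity, rho_b: readout similarity (assumed
    # here to be entrywise correlations between the two tasks' matrices).
    rng = np.random.default_rng(seed)
    a_scale = 1.0 / np.sqrt(n_in)
    b_scale = 1.0 / np.sqrt(n_latent)
    A1 = rng.normal(0.0, a_scale, (n_in, n_latent))
    B1 = rng.normal(0.0, b_scale, (n_out, n_latent))
    A2 = correlated_copy(A1, rho_a, a_scale, rng)
    B2 = correlated_copy(B1, rho_b, b_scale, rng)
    W1 = B1 @ np.linalg.pinv(A1)  # minimum-norm exact solution of task 1
    W2 = learn_task2(W1, A1, A2, B2, lam)
    # Returns (task-1 error after task 2, task-2 error after task 2).
    return rel_loss(W2, A1, B1), rel_loss(W2, A2, B2)

if __name__ == "__main__":
    for lam in (0.0, 1.0):
        forget, fit = run(rho_a=0.9, rho_b=0.0, lam=lam)
        print(f"lam={lam}: task-1 error={forget:.4f}, task-2 error={fit:.4f}")
```

Setting `lam=0.0` recovers plain sequential training, so the two runs can be compared on the same pair of tasks. In this toy setting the input dimension exceeds twice the latent dimension, which is why the penalty can protect task 1 without blocking task 2; that property belongs to the sketch and is not a claim about the paper's results.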
Abstract
Continual learning of partially similar tasks poses a challenge for artificial neural networks, as task similarity presents both an opportunity for knowledge transfer and a risk of interference and catastrophic forgetting. However, it remains unclear how task similarity in input features and readout patterns influences knowledge transfer and forgetting, as well as how they interact with common algorithms for continual learning. Here, we develop a linear teacher-student model with latent structure and show analytically that high input feature similarity coupled with low readout similarity is catastrophic for both knowledge transfer and retention. Conversely, the opposite scenario is relatively benign. Our analysis further reveals that task-dependent activity gating improves knowledge retention at the expense of transfer, while task-dependent plasticity gating does not affect either…
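The qualitative claim about feature and readout similarity can be explored with a minimal sketch of a two-task linear teacher-student setup. This is an illustration under stated assumptions, not the paper's construction: latent variables `s ~ N(0, I)` generate inputs `x = A s` and targets `y = B s`, the student predicts `W x`, and similarity between tasks is modeled as an entrywise correlation `rho_a` between the mixing matrices and `rho_b` between the readouts. All names, dimensions, and learning-rate values are hypothetical.

```python
import numpy as np

def correlated_copy(M, rho, scale, rng):
    # Matrix whose entries have correlation rho with the entries of M.
    noise = rng.normal(0.0, scale, size=M.shape)
    return rho * M + np.sqrt(1.0 - rho**2) * noise

def train(W, A, B, lr=0.2, steps=300):
    # Gradient descent on the population loss E||B s - W A s||^2
    # = ||B - W A||_F^2 for s ~ N(0, I).
    for _ in range(steps):
        residual = B - W @ A
        W = W + 2.0 * lr * residual @ A.T
    return W

def task_loss(W, A, B):
    # Squared error on a task, relative to the error of the zero student.
    return float(np.sum((B - W @ A) ** 2) / np.sum(B ** 2))

def forgetting(rho_a, rho_b, n_latent=20, n_in=200, n_out=10, seed=0):
    rng = np.random.default_rng(seed)
    a_scale = 1.0 / np.sqrt(n_in)
    b_scale = 1.0 / np.sqrt(n_latent)
    A1 = rng.normal(0.0, a_scale, (n_in, n_latent))   # task-1 features
    B1 = rng.normal(0.0, b_scale, (n_out, n_latent))  # task-1 readout
    A2 = correlated_copy(A1, rho_a, a_scale, rng)     # task-2 features
    B2 = correlated_copy(B1, rho_b, b_scale, rng)     # task-2 readout
    W = np.zeros((n_out, n_in))
    W = train(W, A1, B1)  # learn task 1
    W = train(W, A2, B2)  # then learn task 2 with no protection
    return task_loss(W, A1, B1)  # task-1 error after learning task 2

if __name__ == "__main__":
    for rho_a, rho_b in [(0.9, 0.0), (0.0, 0.9), (1.0, 1.0)]:
        err = forgetting(rho_a, rho_b)
        print(f"rho_a={rho_a}, rho_b={rho_b}: task-1 error after task 2 = {err:.4f}")
```

Comparing the `(0.9, 0.0)` and `(0.0, 0.9)` settings contrasts similar features with dissimilar readouts against the opposite scenario. A relative error above 1 means the student does worse on task 1 than a student that outputs zero.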
Taxonomy
Topics: Intelligent Tutoring Systems and Adaptive Learning
