Active Long Term Memory Networks
Tommaso Furlanello, Jiaping Zhao, Andrew M. Saxe, Laurent Itti, Bosco S. Tjan

TL;DR
This paper introduces Active Long Term Memory Networks (A-LTM), a continual learning model that preserves prior knowledge while acquiring new information. It addresses catastrophic interference by using a distillation loss to actively maintain previously learned tasks that are no longer being trained.
Contribution
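A-LTM combines a distillation loss with the non-convex nature of deep neural networks to prevent forgetting: distortions of the learned input-output map are penalized, while hidden layers remain free to move toward new local optima that better suit the multi-task objective. The paper presents this as an approach to sequential multi-task learning and domain adaptation.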
Findings
A-LTM reduces catastrophic interference in deep linear networks.
A-LTM maintains viewpoint recognition across diverse datasets.
Empirical results show improved knowledge retention during domain shifts.
Abstract
Continual Learning in artificial neural networks suffers from interference and forgetting when different tasks are learned sequentially. This paper introduces the Active Long Term Memory Networks (A-LTM), a model of sequential multi-task deep learning that is able to maintain previously learned associations between sensory input and behavioral output while acquiring new knowledge. A-LTM exploits the non-convex nature of deep neural networks and actively maintains knowledge of previously learned, inactive tasks using a distillation loss. Distortions of the learned input-output map are penalized, but hidden layers are free to move towards new local optima that are more favorable for the multi-task objective. We re-frame McClelland's seminal hippocampal theory with respect to the Catastrophic Interference (CI) behavior exhibited by modern deep architectures trained with back-propagation…
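The abstract's central mechanism, penalizing drift in the old task's input-output map while training on a new task, can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the temperature, and the weighting term `distill_weight` are all assumptions made for illustration, and the distillation term here is the commonly used cross-entropy between softened teacher and student outputs.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target_probs, predicted_probs, eps=1e-12):
    """Cross-entropy H(target, predicted) between two distributions."""
    return -sum(t * math.log(p + eps)
                for t, p in zip(target_probs, predicted_probs))


def altm_style_loss(new_task_logits, new_task_label,
                    old_task_logits, frozen_old_task_logits,
                    distill_weight=1.0, temperature=2.0):
    """Hypothetical combined objective for one example.

    new_task_logits:        current network output on the new (active) task
    old_task_logits:        current network output on the old (inactive) task
    frozen_old_task_logits: output of a frozen copy of the network taken
                            before training on the new task (the "teacher")
    """
    # Supervised loss on the task currently being learned.
    new_probs = softmax(new_task_logits)
    one_hot = [1.0 if i == new_task_label else 0.0
               for i in range(len(new_probs))]
    task_loss = cross_entropy(one_hot, new_probs)

    # Distillation term: penalize changes to the old input-output map.
    teacher = softmax(frozen_old_task_logits, temperature)
    student = softmax(old_task_logits, temperature)
    distill_loss = cross_entropy(teacher, student)

    return task_loss + distill_weight * distill_loss
```

Only the outputs on the old task are constrained in this sketch. Nothing ties the hidden representations to their previous values, which mirrors the abstract's point that hidden layers stay free to move.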
Taxonomy
Topics: Advanced Graph Neural Networks · Advanced Memory and Neural Computing · Domain Adaptation and Few-Shot Learning
