Sparse Distributed Memory is a Continual Learner
Trenton Bricken, Xander Davies, Deepak Singh, Dmitry Krotov, Gabriel Kreiman

TL;DR
This paper introduces a biologically inspired MLP based on Sparse Distributed Memory (SDM) that learns continually without memory replay or task information, and it introduces new methods for training sparse networks.
Contribution
The paper presents a biologically inspired MLP variant built on SDM that learns continually without memory replay or task labels. The authors suggest that its methods for training sparse networks may be broadly applicable.
Findings
Every component of the MLP variant translated from biology is necessary for continual learning
No reliance on memory replay or task information
Novel methods for training sparse networks
Abstract
Continual learning is a problem for artificial neural networks that their biological counterparts are adept at solving. Building on work using Sparse Distributed Memory (SDM) to connect a core neural circuit with the powerful Transformer model, we create a modified Multi-Layered Perceptron (MLP) that is a strong continual learner. We find that every component of our MLP variant translated from biology is necessary for continual learning. Our solution is also free from any memory replay or task information, and introduces novel methods to train sparse networks that may be broadly applicable.
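The abstract does not list the individual components of the modified MLP, so the following is only a rough sketch of what a sparsely activated MLP layer can look like, not the authors' implementation. It assumes two ingredients chosen for illustration: L2-normalized inputs and weights, and a top-k activation that keeps the k largest hidden activations per example and zeros the rest. The function names (`l2_normalize`, `top_k_activation`, `sparse_mlp_forward`) and all sizes are hypothetical.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors along `axis` to unit L2 norm (eps guards against division by zero)."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def top_k_activation(pre, k):
    """Keep the k largest entries in each row of `pre`; set all others to zero."""
    # argpartition places the k largest entries of each row in the last k slots.
    idx = np.argpartition(pre, -k, axis=-1)[..., -k:]
    mask = np.zeros(pre.shape, dtype=bool)
    np.put_along_axis(mask, idx, True, axis=-1)
    return np.where(mask, pre, 0.0)


def sparse_mlp_forward(x, w_in, w_out, k):
    """One hidden layer with a top-k activation (illustrative assumption only)."""
    x = l2_normalize(x, axis=-1)        # unit-norm inputs
    w = l2_normalize(w_in, axis=0)      # unit-norm weight vector per hidden unit
    pre = x @ w                         # cosine similarity to each hidden unit
    hidden = top_k_activation(pre, k)   # only k hidden units stay active per example
    return hidden @ w_out, hidden


# Small demonstration with arbitrary sizes.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))        # 4 examples, 8 input features
w_in = rng.normal(size=(8, 16))    # 16 hidden units
w_out = rng.normal(size=(16, 3))   # 3 output classes
logits, hidden = sparse_mlp_forward(x, w_in, w_out, k=3)
active_per_example = np.count_nonzero(hidden, axis=1)
```

In this sketch each example activates only a handful of hidden units, so a gradient step would update only the weights attached to those units. That locality is one intuition for why sparse activations might reduce interference between tasks, though the sketch makes no claim about the paper's actual training procedure.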
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and ELM · Multimodal Machine Learning Applications
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Absolute Position Encodings · Label Smoothing · Byte Pair Encoding · Residual Connection · Dropout · Layer Normalization · Dense Connections
