Learning to Control Rapidly Changing Synaptic Connections: An Alternative Type of Memory in Sequence Processing Artificial Neural Networks
Kazuki Irie, Jürgen Schmidhuber

TL;DR
This paper explores an alternative short-term memory mechanism in neural networks using dynamic, context-sensitive weight matrices, drawing connections to biological neural processes and recent advances like Transformers.
Contribution
It presents a detailed analysis of Fast Weight Programmers as an alternative to traditional neuron activation memory, emphasizing their biological plausibility and relation to modern sequence models.
Findings
FWPs achieve competitive performance on sequence tasks
FWPs are closely related to Transformers in structure and function
The paper highlights biological inspiration behind dynamic weight control
Abstract
Short-term memory in standard, general-purpose, sequence-processing recurrent neural networks (RNNs) is stored as activations of nodes or "neurons." Generalising feedforward NNs to such RNNs is mathematically straightforward and natural, and even historical: already in 1943, McCulloch and Pitts proposed this as a surrogate to "synaptic modifications" (in effect, generalising the Lenz-Ising model, the first non-sequence processing RNN architecture of the 1920s). A lesser known alternative approach to storing short-term memory in "synaptic connections" -- by parameterising and controlling the dynamics of a context-sensitive time-varying weight matrix through another NN -- yields another "natural" type of short-term memory in sequence processing NNs: the Fast Weight Programmers (FWPs) of the early 1990s. FWPs have seen a recent revival as generic sequence processors, achieving competitive…
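The weight-based short-term memory described in the abstract can be illustrated with a minimal sketch. It assumes one common FWP variant: a purely additive outer-product update, in which a slow network maps each input to a key, a value, and a query, and the fast weight matrix is updated by the outer product of value and key. The abstract does not specify an update rule, so this choice, along with the names `project`, `fwp_step`, `run_sequence`, `attention_view`, and the toy weights, is an assumption made for illustration, not the authors' implementation. Under this assumption the output at each step equals an unnormalised linear attention over all past key/value pairs, which is one way to see the relation to Transformers mentioned in the findings.

```python
import math


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def project(A, x):
    """Hypothetical slow-network layer: a linear map followed by tanh."""
    return [math.tanh(v) for v in matvec(A, x)]


def fwp_step(W_fast, x, slow):
    """One time step: the slow net programs the fast weights, then reads them."""
    k = project(slow["key"], x)
    v = project(slow["value"], x)
    q = project(slow["query"], x)
    # Assumed update rule: W_t = W_{t-1} + v_t k_t^T (additive outer product).
    W_new = [
        [W_fast[i][j] + v[i] * k[j] for j in range(len(k))]
        for i in range(len(v))
    ]
    # Read-out: y_t = W_t q_t. The memory lives in W, not in node activations.
    return W_new, matvec(W_new, q)


def run_sequence(xs, slow, d_key, d_value):
    """Process a sequence, starting from an all-zero fast weight matrix."""
    W = [[0.0] * d_key for _ in range(d_value)]
    ys = []
    for x in xs:
        W, y = fwp_step(W, x, slow)
        ys.append(y)
    return W, ys


def attention_view(xs, slow):
    """Same outputs computed as unnormalised linear attention over the past."""
    keys, values, ys = [], [], []
    for x in xs:
        keys.append(project(slow["key"], x))
        values.append(project(slow["value"], x))
        q = project(slow["query"], x)
        y = [0.0] * len(values[0])
        for k_i, v_i in zip(keys, values):
            score = sum(a * b for a, b in zip(k_i, q))  # k_i . q_t
            y = [acc + score * vi for acc, vi in zip(y, v_i)]
        ys.append(y)
    return ys


# Toy slow weights and inputs, chosen arbitrarily for the example.
slow = {
    "key": [[0.5, -0.2], [0.1, 0.3]],
    "value": [[0.2, 0.4], [-0.3, 0.1]],
    "query": [[0.3, 0.1], [0.0, 0.6]],
}
xs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

W_final, ys = run_sequence(xs, slow, d_key=2, d_value=2)
ys_attn = attention_view(xs, slow)
```

In this sketch `ys` and `ys_attn` agree up to floating-point error, because W_t q_t expands to the sum over past steps of v_i (k_i · q_t). The fast weight form keeps a fixed-size matrix as its state, whereas the attention form stores every past key and value.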
Taxonomy
Topics: Neural Networks and Applications · Advanced Memory and Neural Computing · Neural Networks and Reservoir Computing
