Using Fast Weights to Attend to the Recent Past
Jimmy Ba, Geoffrey Hinton, Volodymyr Mnih, Joel Z. Leibo, Catalin Ionescu

TL;DR
This paper introduces fast weights: variables that change more slowly than neural activities but much faster than standard weights. They hold a temporary memory of the recent past, giving sequence models a neurally plausible form of attention without storing copies of neural activity patterns.
Contribution
The paper proposes fast weights as a mechanism for attending to the recent past, giving neural networks a dynamic short-term memory alongside their activities and slowly learned weights.
Findings
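Fast weights store temporary memories of the recent past without keeping copies of neural activity patterns.
Fast weights offer a neurally plausible way to implement the attention to the past that has proved helpful in sequence-to-sequence models.
The approach is motivated by synapses, whose dynamics operate at many different time-scales.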
Abstract
Until recently, research on artificial neural networks was largely restricted to systems with only two types of variable: Neural activities that represent the current or recent input and weights that learn to capture regularities among inputs, outputs and payoffs. There is no good reason for this restriction. Synapses have dynamics at many different time-scales and this suggests that artificial neural networks might benefit from variables that change slower than activities but much faster than the standard weights. These "fast weights" can be used to store temporary memories of the recent past and they provide a neurally plausible way of implementing the type of attention to the past that has recently proved very helpful in sequence-to-sequence models. By using fast weights we can avoid the need to store copies of neural activity patterns.
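The last claim of the abstract can be illustrated with a small numerical sketch. It assumes a decaying outer-product (Hebbian-style) update for the fast-weight matrix; the abstract does not state the update rule, so the rule itself, the function names, and the `decay` and `lr` values below are illustrative assumptions rather than the authors' implementation. Under that assumption, multiplying a query by the fast-weight matrix gives the same result as attending over an explicit list of stored past hidden states, with each state weighted by its decayed scalar product with the query.

```python
import numpy as np

def fast_weight_update(A, h, decay=0.95, lr=0.5):
    """Decay the fast-weight matrix and add the outer product of the current state.

    Assumed rule (not given in the abstract): A <- decay * A + lr * h h^T.
    """
    return decay * A + lr * np.outer(h, h)

def attend_via_fast_weights(A, query):
    """Retrieve from the recent past using only the fast-weight matrix."""
    return A @ query

def attend_via_stored_states(states, query, decay=0.95, lr=0.5):
    """Same quantity computed from explicitly stored past states (comparison only)."""
    T = len(states)
    out = np.zeros_like(query)
    for tau, h in enumerate(states):
        # Older states are decayed more; similarity is a plain scalar product.
        weight = lr * decay ** (T - 1 - tau) * (h @ query)
        out += weight * h
    return out

rng = np.random.default_rng(0)
dim = 8
states = [rng.standard_normal(dim) for _ in range(5)]  # hypothetical hidden states

# Build the fast weights online, one hidden state at a time.
A = np.zeros((dim, dim))
for h in states:
    A = fast_weight_update(A, h)

query = rng.standard_normal(dim)
via_matrix = attend_via_fast_weights(A, query)
via_memory = attend_via_stored_states(states, query)
print(np.allclose(via_matrix, via_memory))
```

The matrix version never keeps the list `states`; the whole recent history is folded into the single `dim` by `dim` matrix `A`, which is the sense in which copies of activity patterns need not be stored.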
Taxonomy
Topics: Neural Networks and Applications
