Learning to Remember Rare Events
Łukasz Kaiser, Ofir Nachum, Aurko Roy, Samy Bengio

TL;DR
This paper introduces a scalable, differentiable memory module for deep neural networks that enables effective life-long and one-shot learning of rare events across various architectures and tasks.
Contribution
It presents a large-scale, end-to-end trainable memory module that enhances neural networks' ability to remember rare events over long periods without resets.
Findings
Achieved state-of-the-art one-shot learning results on the Omniglot dataset.
Enabled life-long one-shot learning in recurrent neural networks for machine translation.
Demonstrated versatility by integrating with different neural network architectures.
Abstract
Despite recent advances, memory-augmented deep neural networks are still limited when it comes to life-long and one-shot learning, especially in remembering rare events. We present a large-scale life-long memory module for use in deep learning. The module exploits fast nearest-neighbor algorithms for efficiency and thus scales to large memory sizes. Except for the nearest-neighbor query, the module is fully differentiable and trained end-to-end with no extra supervision. It operates in a life-long manner, i.e., without the need to reset it during training. Our memory module can be easily added to any part of a supervised neural network. To show its versatility we add it to a number of networks, from simple convolutional ones tested on image classification to deep sequence-to-sequence and recurrent-convolutional models. In all cases, the enhanced network gains the ability to remember…
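The idea in the abstract (a memory that is queried by nearest neighbor and never reset) can be illustrated with a small sketch. This is not the authors' implementation: the class name `NearestNeighborMemory`, the exact linear scan standing in for a fast nearest-neighbor algorithm, and the update rule (merge the query into the key on a label match, otherwise overwrite an empty or the oldest slot) are all assumptions made for illustration. The training loss and the differentiable parts of the module are omitted.

```python
import math


def normalize(v):
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


class NearestNeighborMemory:
    """Hypothetical toy key-value memory with cosine nearest-neighbor lookup."""

    def __init__(self, size, dim):
        self.keys = [[0.0] * dim for _ in range(size)]
        self.values = [None] * size  # None marks an empty slot
        self.ages = [0] * size       # steps since a slot was last written

    def query(self, q):
        """Return (value, slot, similarity) of the nearest stored key.

        Returns (None, -1, 0.0) when the memory is empty. A linear scan is
        used here; a large memory would need a fast (approximate) search.
        """
        q = normalize(q)
        best_idx, best_sim = -1, -math.inf
        for i, key in enumerate(self.keys):
            if self.values[i] is None:
                continue
            sim = sum(a * b for a, b in zip(q, key))  # cosine of unit vectors
            if sim > best_sim:
                best_idx, best_sim = i, sim
        if best_idx < 0:
            return None, -1, 0.0
        return self.values[best_idx], best_idx, best_sim

    def update(self, q, label):
        """Assumed update rule: merge on a correct hit, else write a new slot."""
        q = normalize(q)
        value, idx, _ = self.query(q)
        self.ages = [a + 1 for a in self.ages]
        if idx >= 0 and value == label:
            # Nearest neighbor already has the right label: average the keys.
            self.keys[idx] = normalize([a + b for a, b in zip(q, self.keys[idx])])
            self.ages[idx] = 0
        else:
            # Otherwise store the pair in an empty slot, or the oldest one.
            empty = [i for i, v in enumerate(self.values) if v is None]
            slot = empty[0] if empty else max(range(len(self.ages)),
                                              key=self.ages.__getitem__)
            self.keys[slot] = q
            self.values[slot] = label
            self.ages[slot] = 0


# Usage: a label seen exactly once can be recalled from a nearby query.
memory = NearestNeighborMemory(size=4, dim=2)
memory.update([1.0, 0.0], "common")
memory.update([0.0, 1.0], "rare")
value, slot, sim = memory.query([0.1, 0.9])
```

Because slots are only overwritten when space runs out, the memory is never cleared between examples, which is the sense in which such a module can operate in a life-long manner.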
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Topic Modeling · Multimodal Machine Learning Applications
