Learning to Remember Rare Events

Łukasz Kaiser; Ofir Nachum; Aurko Roy; Samy Bengio

arXiv:1703.03129 · cs.LG · March 10, 2017 · 237 cites

TL;DR

This paper introduces a scalable, differentiable memory module for deep neural networks that enables effective life-long and one-shot learning of rare events across various architectures and tasks.

Contribution

It presents a large-scale, end-to-end trainable memory module that enhances neural networks' ability to remember rare events over long periods without resets.

Findings

1. Achieved state-of-the-art one-shot learning on the Omniglot dataset.

2. Enabled life-long one-shot learning in recurrent neural networks for machine translation.

3. Demonstrated versatility by integrating with different neural network architectures.

Abstract

Despite recent advances, memory-augmented deep neural networks are still limited when it comes to life-long and one-shot learning, especially in remembering rare events. We present a large-scale life-long memory module for use in deep learning. The module exploits fast nearest-neighbor algorithms for efficiency and thus scales to large memory sizes. Except for the nearest-neighbor query, the module is fully differentiable and trained end-to-end with no extra supervision. It operates in a life-long manner, i.e., without the need to reset it during training. Our memory module can be easily added to any part of a supervised neural network. To show its versatility we add it to a number of networks, from simple convolutional ones tested on image classification to deep sequence-to-sequence and recurrent-convolutional models. In all cases, the enhanced network gains the ability to remember…
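The nearest-neighbor lookup described in the abstract can be illustrated with a small key-value memory. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract only says that the module queries by nearest neighbor and is never reset. The class name `ToyMemory`, the use of cosine similarity, the age-based overwrite, and the key-merging rule are all assumptions made here for illustration.

```python
import math


def normalize(v):
    """Scale a vector to unit L2 norm (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


class ToyMemory:
    """Hypothetical key-value memory; the update rule is an illustrative assumption."""

    def __init__(self, size, key_dim):
        self.keys = [[0.0] * key_dim for _ in range(size)]
        self.values = [None] * size  # None marks an empty slot
        self.ages = [0] * size       # updates since the slot was last written

    def query(self, q):
        """Return (slot index, stored value, cosine similarity) of the nearest key."""
        q = normalize(q)
        sims = [sum(a * b for a, b in zip(q, k)) for k in self.keys]
        best = max(range(len(sims)), key=sims.__getitem__)
        return best, self.values[best], sims[best]

    def update(self, q, label):
        """Store (q, label) without ever resetting the memory."""
        q = normalize(q)
        best, value, _ = self.query(q)
        self.ages = [a + 1 for a in self.ages]
        if value == label:
            # Assumed rule: nearest slot already holds this label, so merge
            # the query into its key and mark the slot as fresh.
            merged = [a + b for a, b in zip(q, self.keys[best])]
            self.keys[best] = normalize(merged)
            self.ages[best] = 0
        else:
            # Assumed rule: otherwise overwrite the oldest slot.
            oldest = max(range(len(self.ages)), key=self.ages.__getitem__)
            self.keys[oldest] = q
            self.values[oldest] = label
            self.ages[oldest] = 0


# Usage: a single stored example is enough to retrieve its label later.
mem = ToyMemory(size=4, key_dim=2)
mem.update([1.0, 0.0], "a")
mem.update([0.0, 1.0], "b")
slot, value, similarity = mem.query([0.9, 0.1])
```

The lookup itself is an argmax over similarities, which is the non-differentiable step the abstract mentions. A linear scan is used here for clarity; scaling to large memories would rely on a fast (possibly approximate) nearest-neighbor index instead.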

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Topic Modeling · Multimodal Machine Learning Applications