Memory Augmented Neural Networks with Wormhole Connections

Caglar Gulcehre; Sarath Chandar; Yoshua Bengio

arXiv:1701.08718 · cs.LG · January 31, 2017 · 44 cites

Open Access

TL;DR

The paper introduces TARDIS, a memory-augmented neural network with wormhole connections that improves gradient flow and learning of long-term dependencies, outperforming traditional RNNs on various tasks.

Contribution

It proposes TARDIS, a novel, efficient memory-augmented neural network with simplified read/write operations and wormhole connections for better long-term dependency learning.

Findings

01. TARDIS effectively reduces vanishing gradients on long sequences.

02. It achieves competitive results on long-term dependency tasks.

03. Memory operations are simpler and more efficient than previous models.

Abstract

Recent empirical results on long-term dependency tasks have shown that neural networks augmented with an external memory can learn these tasks more easily and generalize better than vanilla recurrent neural networks (RNNs). We suggest that memory-augmented neural networks can reduce the effects of vanishing gradients by creating shortcut (or wormhole) connections. Based on this observation, we propose a novel memory-augmented neural network model called TARDIS (Temporal Automatic Relation Discovery in Sequences). The controller of TARDIS can store a selective set of embeddings of its own previous hidden states in an external memory and revisit them as needed. For TARDIS, memory acts as storage for wormhole connections to the past, which propagate gradients more effectively and help the model learn temporal dependencies. The memory structure of…
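
The wormhole idea in the abstract can be illustrated with a toy sketch. This is not the TARDIS architecture: the scalar hidden state, the function names `run_toy_memory_rnn` and `shortest_gradient_path`, the weights `w_rec` and `w_read`, the nearest-value read rule, and the fill-then-overwrite write rule are all assumptions made for illustration. The sketch only shows that a discrete read from a stored past state adds a backward edge that can skip many time steps, so the shortest gradient path to an early step may be shorter than in a plain RNN.

```python
import math
from collections import deque


def run_toy_memory_rnn(inputs, num_slots=4, w_rec=0.5, w_read=0.5):
    """Scalar toy controller with an external memory of its own past states.

    Hypothetical update: h_t = tanh(w_rec * h_{t-1} + w_read * r_t + x_t),
    where r_t is one stored past state chosen by a discrete read.
    """
    if num_slots < 1:
        raise ValueError("num_slots must be at least 1")
    memory = []   # each entry: (stored_state, time_step_it_was_written)
    links = []    # links[t]: time step whose state was read at step t, or None
    states = []
    h = 0.0
    for t, x in enumerate(inputs):
        slot = None
        read_value = 0.0
        if memory:
            # Assumed read rule: pick the slot whose value is closest to the input.
            slot = min(range(len(memory)), key=lambda i: abs(memory[i][0] - x))
            read_value, written_at = memory[slot]
            links.append(written_at)
        else:
            links.append(None)
        h = math.tanh(w_rec * h + w_read * read_value + x)
        states.append(h)
        # Assumed write rule: fill empty slots first, then overwrite the slot just read.
        if len(memory) < num_slots:
            memory.append((h, t))
        else:
            memory[slot] = (h, t)
    return states, links


def shortest_gradient_path(links, start, target=0):
    """Fewest backward hops from step `start` to step `target`.

    Each step t has an edge to t - 1 (the recurrent connection) and, if it
    read from memory, an edge to links[t] (the wormhole connection).
    """
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        t, hops = queue.popleft()
        if t == target:
            return hops
        neighbours = [t - 1] if t > 0 else []
        if links[t] is not None:
            neighbours.append(links[t])
        for n in neighbours:
            if n not in seen:
                seen.add(n)
                queue.append((n, hops + 1))
    return None


if __name__ == "__main__":
    xs = [math.sin(0.3 * t) for t in range(50)]
    states, links = run_toy_memory_rnn(xs)
    # A plain RNN needs 49 hops from step 49 back to step 0.
    print("hops with wormholes:", shortest_gradient_path(links, start=49))
```

In this toy the read rule is fixed by hand; a learned model would have to train the addressing itself, which the sketch does not attempt.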

Taxonomy

Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning · Neural Networks and Applications