Learning to Transduce with Unbounded Memory

Edward Grefenstette; Karl Moritz Hermann; Mustafa Suleyman; Phil Blunsom

arXiv:1506.02516 · cs.NE · November 4, 2015 · 138 cites

TL;DR

This paper introduces memory-augmented recurrent neural networks with differentiable data structures, demonstrating improved generalization and the ability to learn underlying algorithms in language transduction tasks.

Contribution

It proposes new neural architectures with differentiable stacks, queues, and deques, enhancing the representational power and generalization of recurrent models for transduction.

Findings

01

Memory-augmented networks generalize better than Deep RNNs in the paper's transduction experiments.

02

Differentiable data structures often allow the models to learn the underlying generating algorithms.

03

Models generalize well on synthetic grammars designed to exhibit phenomena found in real transduction problems such as machine translation.

Abstract

Recently, strong results have been demonstrated by Deep Recurrent Neural Networks on natural language transduction problems. In this paper we explore the representational power of these models using synthetic grammars designed to exhibit phenomena similar to those found in real transduction problems such as machine translation. These experiments lead us to propose new memory-based recurrent networks that implement continuously differentiable analogues of traditional data structures such as Stacks, Queues, and DeQues. We show that these architectures exhibit superior generalisation performance to Deep RNNs and are often able to learn the underlying generating algorithms in our transduction experiments.
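To make the idea of a continuously differentiable stack concrete, here is a minimal sketch of one way such a structure can work: every stored vector carries a strength between 0 and 1, and push, pop, and read act on those strengths using only min, max, and addition. The function name `stack_step`, its arguments, and the exact update rules are illustrative assumptions, not taken from the text above, and the sketch uses plain Python floats for readability; a trained model would run the same arithmetic inside an autodiff framework, with a recurrent controller producing `v`, `push`, and `pop`.

```python
from typing import List, Tuple

Vector = List[float]


def stack_step(
    values: List[Vector],
    strengths: List[float],
    v: Vector,
    push: float,
    pop: float,
) -> Tuple[List[Vector], List[float], Vector]:
    """One update of a continuous stack (hypothetical helper).

    Returns the new values, the new strengths, and the read vector.
    `push` and `pop` are expected to lie in [0, 1].
    """
    # Pop: remove up to `pop` units of strength, starting from the top.
    new_strengths = [0.0] * len(strengths)
    above = 0.0  # total strength stored above the current entry
    for i in range(len(strengths) - 1, -1, -1):
        new_strengths[i] = max(0.0, strengths[i] - max(0.0, pop - above))
        above += strengths[i]

    # Push: append the new vector on top with strength `push`.
    new_values = values + [list(v)]
    new_strengths.append(push)

    # Read: blend entries from the top down until one unit of
    # strength has been consumed.
    read = [0.0] * len(v)
    above = 0.0
    for i in range(len(new_values) - 1, -1, -1):
        weight = min(new_strengths[i], max(0.0, 1.0 - above))
        read = [r + weight * x for r, x in zip(read, new_values[i])]
        above += new_strengths[i]

    return new_values, new_strengths, read


# Push [1, 0] fully, then push [0, 1] with strength 0.5.
vals, strs, r = stack_step([], [], [1.0, 0.0], push=1.0, pop=0.0)
vals, strs, r = stack_step(vals, strs, [0.0, 1.0], push=0.5, pop=0.0)
print(r)  # [0.5, 0.5]
```

With push and pop values of exactly 0 or 1 this behaves like an ordinary stack; fractional values give a blend of the top entries, which is what lets gradients flow through the push and pop decisions.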

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Algorithms and Data Compression