Learning Efficient Algorithms with Hierarchical Attentive Memory

Marcin Andrychowicz; Karol Kurach

arXiv:1602.03218 · cs.LG · February 24, 2016 · 28 cites

Open Access

TL;DR

This paper introduces Hierarchical Attentive Memory (HAM), a memory architecture that enables neural networks to perform efficient, logarithmic-time memory access, facilitating learning of algorithms and data structures from examples.

Contribution

The paper presents HAM, a novel binary-tree-based memory architecture that reduces the cost of a memory access from O(n) to O(log n), where n is the memory size, and enables neural networks to learn classic algorithms and data structures from examples and generalize them to longer inputs.

Findings

01

HAM accesses memory in O(log n) time, where n is the memory size, versus O(n) for standard attention.

02

An LSTM augmented with HAM learns algorithms such as merging, sorting, and binary search from input-output examples.

03

HAM can be trained to act like classic data structures: a stack, a FIFO queue, and a priority queue.

Abstract

In this paper, we propose and investigate a novel memory architecture for neural networks called Hierarchical Attentive Memory (HAM). It is based on a binary tree with leaves corresponding to memory cells. This allows HAM to perform memory access in O(log n) complexity, which is a significant improvement over the standard attention mechanism that requires O(n) operations, where n is the size of the memory. We show that an LSTM network augmented with HAM can learn algorithms for problems like merging, sorting or binary searching from pure input-output examples. In particular, it learns to sort n numbers in time O(n log n) and generalizes well to input sequences much longer than the ones seen during the training. We also show that HAM can be trained to act like classic data structures: a stack, a FIFO queue and a priority queue.
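
To make the O(log n) claim concrete, here is a minimal Python sketch of a root-to-leaf descent over a binary tree whose leaves are the memory cells. It is an illustration under stated assumptions, not the paper's architecture: the names build_tree and tree_access are hypothetical, the memory size is assumed to be a power of two, and the fixed functions passed in (min for combining children, a simple comparison for choosing a branch) stand in for components that HAM would learn from data.

```python
def build_tree(leaves, join):
    """Build a complete binary tree bottom-up as a list of levels.

    levels[0] holds the memory cells (leaves); levels[-1] holds the root.
    `join` combines two children into their parent. It is a fixed function
    here, purely as a stand-in for a learned component (an assumption).
    """
    n = len(leaves)
    if n == 0 or n & (n - 1) != 0:
        raise ValueError("this sketch assumes a power-of-two memory size")
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([join(prev[i], prev[i + 1])
                       for i in range(0, len(prev), 2)])
    return levels


def tree_access(levels, go_right):
    """Walk from the root to one leaf, making one decision per level.

    `go_right(left, right)` returns True to descend into the right child.
    Returns (leaf_index, decisions_made); the number of decisions equals
    the tree depth, i.e. log2(n) for n leaves.
    """
    index = 0
    decisions = 0
    for depth in range(len(levels) - 1, 0, -1):
        left = levels[depth - 1][2 * index]
        right = levels[depth - 1][2 * index + 1]
        index = 2 * index + (1 if go_right(left, right) else 0)
        decisions += 1
    return index, decisions


# Illustrative setup: each inner node stores the minimum of its subtree,
# so always descending toward the smaller child reaches the smallest cell.
memory = [7, 3, 9, 1, 4, 8, 2, 6]
levels = build_tree(memory, min)
leaf, steps = tree_access(levels, lambda left, right: right < left)
print(leaf, memory[leaf], steps)  # prints: 3 1 3
```

With 8 cells the descent makes 3 decisions, whereas an attention mechanism that scores every cell touches all 8. Building the tree still costs O(n) once; the saving applies to each subsequent access.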

Taxonomy

Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning · Natural Language Processing Techniques

Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory