Hierarchical Memory Networks
Sarath Chandar, Sungjin Ahn, Hugo Larochelle, Pascal Vincent, Gerald Tesauro, Yoshua Bengio

TL;DR
This paper introduces a hierarchical memory network, a hybrid between hard and soft attention, that uses Maximum Inner Product Search (MIPS) to make memory access both scalable and trainable for large-scale question answering.
Contribution
It proposes a hierarchical memory architecture with integrated MIPS techniques, which reads with less computation than soft attention over a flat memory and is easier to train than hard attention over a flat memory.
Findings
Reads from memory with less computation than soft attention over a flat memory.
Is easier to train than hard attention over a flat memory.
Demonstrates effectiveness on large-scale question answering.
Abstract
Memory networks are neural networks with an explicit memory component that can be both read and written to by the network. The memory is often addressed in a soft way using a softmax function, making end-to-end training with backpropagation possible. However, this is not computationally scalable for applications which require the network to read from extremely large memories. On the other hand, it is well known that hard attention mechanisms based on reinforcement learning are challenging to train successfully. In this paper, we explore a form of hierarchical memory network, which can be considered as a hybrid between hard and soft attention memory networks. The memory is organized in a hierarchical structure such that reading from it is done with less computation than soft attention over a flat memory, while also being easier to train than hard attention over a flat memory.…
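The contrast the abstract draws between soft attention over a flat memory and a cheaper, restricted read can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `soft_read` and `topk_read` are hypothetical, and the top-k step is an exact brute-force search that still scores every slot. It stands in for a MIPS index only to show where the softmax is restricted to a small candidate set.

```python
import heapq
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def soft_read(query, memory):
    """Soft attention over a flat memory: every slot receives a weight."""
    weights = softmax([dot(query, m) for m in memory])
    dim = len(memory[0])
    return [sum(w * m[d] for w, m in zip(weights, memory)) for d in range(dim)]


def topk_read(query, memory, k):
    """Softmax restricted to the k slots with the largest inner product.

    Assumption: the candidate search is exact and brute force here. A real
    MIPS structure would return the candidates without scoring every slot.
    """
    scores = [dot(query, m) for m in memory]
    # Indices of the k highest-scoring slots, best first.
    top = heapq.nlargest(k, range(len(memory)), key=scores.__getitem__)
    weights = softmax([scores[i] for i in top])
    dim = len(memory[0])
    read = [sum(w * memory[i][d] for w, i in zip(weights, top)) for d in range(dim)]
    return read, top
```

Calling `topk_read(query, memory, k)` with `k` equal to the number of slots reproduces `soft_read(query, memory)` up to floating-point rounding. Smaller values of `k` restrict the softmax, and therefore the gradient, to the selected slots.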
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Machine Learning and Algorithms
Methods: Softmax
