Recurrent Neural Networks with External Memory for Language Understanding

Baolin Peng; Kaisheng Yao

arXiv:1506.00195 · cs.CL · June 2, 2015 · 35 citations

TL;DR

This paper adds an external memory component to recurrent neural networks so that they can better memorize long-term dependencies, which improves language understanding performance.

Contribution

The paper proposes an RNN architecture with external memory. It targets the limited memory capacity of simple RNNs, which the abstract attributes to vanishing and exploding gradients, in language understanding tasks.

Findings

01 Achieved state-of-the-art results on the ATIS dataset

02 Improved the memorization capability of RNNs by adding an external memory

03 Compared the proposed model with alternative models and reported analysis that may inform future research

Abstract

Recurrent Neural Networks (RNNs) have become increasingly popular for the task of language understanding. In this task, a semantic tagger assigns a semantic label to each word in an input sequence. The success of RNNs may be attributed to their ability to memorize long-term dependencies that relate the current semantic label prediction to observations many time steps away. However, the memory capacity of simple RNNs is limited by the vanishing and exploding gradient problem. We propose to use an external memory to improve the memorization capability of RNNs. We conducted experiments on the ATIS dataset and observed that the proposed model achieved state-of-the-art results. We compare our proposed model with alternative models and report analysis results that may provide insights for future research.
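
The idea in the abstract, a per-word tagger whose recurrent state is supplemented by an external memory, can be illustrated with a small sketch. This is not the paper's architecture or its read/write equations, which this page does not give. It assumes a plain tanh RNN, a content-based read (softmax over dot-product similarity), and a round-robin write of the hidden state. All names here (`read_memory`, `tag_sequence`, `W_rh`, the dimensions, the random weights and inputs) are hypothetical and chosen only for illustration.

```python
import math
import random


def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, v):
    # Matrix-vector product for a list-of-rows matrix.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def read_memory(memory, key):
    # Assumed read rule: attention weights from dot-product similarity
    # between the key and each slot, then a weighted sum of the slots.
    scores = [sum(m * k for m, k in zip(slot, key)) for slot in memory]
    weights = softmax(scores)
    dim = len(memory[0])
    read = [sum(w * slot[j] for w, slot in zip(weights, memory))
            for j in range(dim)]
    return read, weights


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def tag_sequence(inputs, params, n_slots=4):
    # Emits one label index per input vector (one per word).
    hidden = len(params["W_hh"])
    h = [0.0] * hidden
    memory = [[0.0] * hidden for _ in range(n_slots)]
    labels = []
    for t, x in enumerate(inputs):
        # Read from memory using the previous hidden state as the key.
        read, _ = read_memory(memory, h)
        pre = [a + b + c for a, b, c in zip(
            matvec(params["W_xh"], x),
            matvec(params["W_hh"], h),
            matvec(params["W_rh"], read))]
        h = [math.tanh(p) for p in pre]
        # Assumed write rule: overwrite slots in round-robin order.
        memory[t % n_slots] = list(h)
        probs = softmax(matvec(params["W_hy"], h))
        labels.append(probs.index(max(probs)))
    return labels


# Untrained random weights and inputs, purely to exercise the code path.
rng = random.Random(0)
IN_DIM, HIDDEN, N_LABELS = 3, 5, 4
params = {
    "W_xh": rand_matrix(HIDDEN, IN_DIM, rng),
    "W_hh": rand_matrix(HIDDEN, HIDDEN, rng),
    "W_rh": rand_matrix(HIDDEN, HIDDEN, rng),
    "W_hy": rand_matrix(N_LABELS, HIDDEN, rng),
}
sentence = [[rng.uniform(-1, 1) for _ in range(IN_DIM)] for _ in range(6)]
labels = tag_sequence(sentence, params)
print(labels)
```

The point of the sketch is the data flow: at every word the tagger can consult stored states directly through the memory read, rather than relying only on what survives in the recurrent hidden state. With untrained weights the printed labels carry no meaning.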

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications