Ask Me Anything: Dynamic Memory Networks for Natural Language Processing
Ankit Kumar, Ozan Irsoy, Peter Ondruska, Mohit Iyyer, James Bradbury, Ishaan Gulrajani, Victor Zhong, Romain Paulus, and Richard Socher

TL;DR
The paper introduces the dynamic memory network (DMN), a neural architecture that processes language input and questions to generate answers, achieving state-of-the-art results across multiple NLP tasks.
Contribution
The paper presents the DMN, a novel neural network architecture with iterative attention and hierarchical reasoning, capable of end-to-end training for diverse NLP question-answering tasks.
Findings
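Achieves state-of-the-art results on question answering (Facebook's bAbI dataset)
Achieves state-of-the-art results on text classification for sentiment analysis (Stanford Sentiment Treebank)
Achieves state-of-the-art results on sequence modeling for part-of-speech tagging (WSJ-PTB)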
Abstract
Most tasks in natural language processing can be cast into question answering (QA) problems over language input. We introduce the dynamic memory network (DMN), a neural network architecture which processes input sequences and questions, forms episodic memories, and generates relevant answers. Questions trigger an iterative attention process which allows the model to condition its attention on the inputs and the result of previous iterations. These results are then reasoned over in a hierarchical recurrent sequence model to generate answers. The DMN can be trained end-to-end and obtains state-of-the-art results on several types of tasks and datasets: question answering (Facebook's bAbI dataset), text classification for sentiment analysis (Stanford Sentiment Treebank) and sequence modeling for part-of-speech tagging (WSJ-PTB). The training for these different tasks relies exclusively on…
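The iterative attention process described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a simplified variant in which each pass scores every input fact against the question and the current memory, normalizes the scores with a softmax, forms an episode as the weighted sum of facts, and updates the memory with a single tanh layer. All names (`attention_weights`, `episodic_memory`, `W1`, `w2`, `Wm`) and the randomly initialized weights are hypothetical and chosen only for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vec):
    """Multiply a (rows x cols) matrix by a length-cols vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def attention_weights(facts, question, memory, W1, w2):
    """Score each fact against the question and the current memory."""
    scores = []
    for fact in facts:
        # Assumed interaction features: element-wise products and distances.
        feats = ([f * q for f, q in zip(fact, question)]
                 + [f * m for f, m in zip(fact, memory)]
                 + [abs(f - q) for f, q in zip(fact, question)]
                 + [abs(f - m) for f, m in zip(fact, memory)])
        hidden = [math.tanh(v) for v in matvec(W1, feats)]
        scores.append(sum(w * h for w, h in zip(w2, hidden)))
    return softmax(scores)


def episodic_memory(facts, question, W1, w2, Wm, num_passes=3):
    """Run several attention passes; each pass conditions on the last memory."""
    memory = list(question)  # assumption: memory starts as the question vector
    history = []
    dim = len(question)
    for _ in range(num_passes):
        weights = attention_weights(facts, question, memory, W1, w2)
        # Episode: attention-weighted sum of fact vectors (a simplification).
        episode = [sum(a * fact[i] for a, fact in zip(weights, facts))
                   for i in range(dim)]
        # Memory update: one tanh layer over [memory; episode; question].
        memory = [math.tanh(v)
                  for v in matvec(Wm, memory + episode + list(question))]
        history.append(weights)
    return memory, history


def random_matrix(rows, cols, rng, scale=0.1):
    """Gaussian-initialized matrix as a list of rows (illustrative only)."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
n_facts, dim, hidden = 5, 8, 16
facts = random_matrix(n_facts, dim, rng, scale=1.0)
question = [rng.gauss(0.0, 1.0) for _ in range(dim)]
W1 = random_matrix(hidden, 4 * dim, rng)
w2 = [rng.gauss(0.0, 0.1) for _ in range(hidden)]
Wm = random_matrix(dim, 3 * dim, rng)

memory, history = episodic_memory(facts, question, W1, w2, Wm)
for pass_idx, weights in enumerate(history, start=1):
    print(pass_idx, [round(a, 3) for a in weights])
```

Because the weights are untrained, the printed attention distributions carry no meaning; the point is only that the attention in each pass depends on the memory produced by the previous pass. A trained model would learn these parameters end-to-end, and a recurrent unit such as the GRU listed under Methods could stand in for the weighted sum and the tanh update used here.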
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
Methods: Softmax · Gated Recurrent Unit · Dynamic Memory Network
