Memory-Augmented Recurrent Networks for Dialogue Coherence
David Donahue, Yuanliang Meng, Anna Rumshisky

TL;DR
This paper introduces memory-augmented recurrent networks that use Neural Turing Machines (NTMs) to improve dialogue coherence by providing flexible, expandable memory storage. It compares two NTM-based architectures against baseline models using perplexity.
Contribution
The paper proposes novel dialogue architectures that use NTMs for memory management, enabling the model to track longer context and hold more coherent conversations.
Findings
NTM-based models outperform traditional fixed-size vector models.
The second architecture with a neural language model achieves lower perplexity.
Memory-augmented networks show promise for maintaining dialogue coherence.
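Because the findings above are stated in terms of perplexity, a short illustration of the metric may help. The sketch below uses the standard definition (the exponential of the average negative log-likelihood per token). The function name `perplexity` and the probability values are hypothetical, chosen only for illustration; they are not results from the paper.

```python
import math

def perplexity(token_probs):
    """Perplexity = exp(mean negative log-likelihood of the reference tokens).

    token_probs: probabilities a model assigned to each correct next token.
    """
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

# Hypothetical values: a model that guesses uniformly over 4 words
# versus a model that is fairly confident about each correct word.
uniform = perplexity([0.25, 0.25, 0.25, 0.25])
confident = perplexity([0.9, 0.8, 0.95, 0.85])
```

Lower perplexity means the model assigned higher probability to the words that actually occurred, which is why a lower score is read as a better fit to the dialogue data.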
Abstract
Recent dialogue approaches operate by reading each word in a conversation history, and aggregating accrued dialogue information into a single state. This fixed-size vector is not expandable and must maintain a consistent format over time. Other recent approaches exploit an attention mechanism to extract useful information from past conversational utterances, but this introduces an increased computational complexity. In this work, we explore the use of the Neural Turing Machine (NTM) to provide a more permanent and flexible storage mechanism for maintaining dialogue coherence. Specifically, we introduce two separate dialogue architectures based on this NTM design. The first design features a sequence-to-sequence architecture with two separate NTM modules, one for each participant in the conversation. The second memory architecture incorporates a single NTM module, which stores parallel…
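To make the storage mechanism more concrete, here is a minimal sketch of content-based addressing with a read and an erase/add write, following the general Neural Turing Machine formulation. It is not the authors' implementation: the function names, the memory size, and all numeric values are assumptions for illustration, and location-based addressing and the learned controller are omitted.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors (small epsilon avoids 0/0)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / (norm + 1e-8)

def content_weights(memory, key, beta):
    """Softmax over beta-scaled similarities between the key and each slot."""
    scores = [beta * cosine(row, key) for row in memory]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def read(memory, weights):
    """Read vector: weighted sum of memory slots."""
    width = len(memory[0])
    return [sum(w * row[j] for w, row in zip(weights, memory))
            for j in range(width)]

def write(memory, weights, erase, add):
    """Erase then add, scaled per slot by its weight; returns a new memory."""
    return [[m * (1.0 - w * e) + w * a for m, e, a in zip(row, erase, add)]
            for w, row in zip(weights, memory)]

# Hypothetical 3-slot memory with 3-dimensional slots.
memory = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
key = [0.9, 0.1, 0.0]  # query emitted by a controller (assumed values)
weights = content_weights(memory, key, beta=10.0)
read_vector = read(memory, weights)
new_memory = write(memory, weights,
                   erase=[1.0, 1.0, 1.0], add=[0.5, 0.5, 0.0])
```

Unlike a single fixed-size state vector, the memory here can hold several entries at once, and a write mostly changes the slots that receive high weight, leaving the others close to their previous contents.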
Taxonomy
Topics: Topic Modeling · Speech and dialogue systems · Natural Language Processing Techniques
Methods: Softmax · Sigmoid Activation · Tanh Activation · Neural Turing Machine · Location-based Attention · Content-based Attention · Long Short-Term Memory
