Challenging Neural Dialogue Models with Natural Data: Memory Networks Fail on Incremental Phenomena

Igor Shalyminov; Arash Eshghi; Oliver Lemon

arXiv:1709.07840·cs.CL·September 25, 2017

TL;DR

This paper investigates how neural dialogue models trained on clean data struggle with natural, spontaneous dialogue phenomena, demonstrating that linguistically informed models outperform neural models in handling real-world incremental dialogue features.

Contribution

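The study introduces bAbI+, a corpus built by systematically adding spontaneous incremental dialogue phenomena (such as restarts and self-corrections) to the bAbI dialogues dataset, and compares a neural retrieval model (MemN2N) with a linguistically informed parser, revealing the limitations of the neural model on these phenomena.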

Findings

01 MemN2N performance drops significantly on bAbI+

02 Neural models require excessive training data to learn natural phenomena

03 Linguistically informed parser achieves 100% accuracy on both datasets

Abstract

Natural, spontaneous dialogue proceeds incrementally on a word-by-word basis; and it contains many sorts of disfluency such as mid-utterance/sentence hesitations, interruptions, and self-corrections. But training data for machine learning approaches to dialogue processing is often either cleaned-up or wholly synthetic in order to avoid such phenomena. The question then arises of how well systems trained on such clean data generalise to real spontaneous dialogue, or indeed whether they are trainable at all on naturally occurring dialogue data. To answer this question, we created a new corpus called bAbI+ by systematically adding natural spontaneous incremental dialogue phenomena such as restarts and self-corrections to the Facebook AI Research's bAbI dialogues dataset. We then explore the performance of a state-of-the-art retrieval model, MemN2N, on this more natural dataset. Results…
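The kind of augmentation the abstract describes (adding hesitations and self-corrections to clean utterances) can be illustrated with a minimal sketch. This is not the authors' generation procedure or the babi_tools code: the filler inventory, the "no sorry" correction template, and the function names are all assumptions made for illustration.

```python
import random

# Hypothetical filler inventory; the fillers actually used in bAbI+ may differ.
HESITATIONS = ["uhm", "er", "well"]


def add_hesitation(tokens, rng):
    """Return a copy of tokens with one filler inserted at a random position."""
    pos = rng.randrange(len(tokens) + 1)
    return tokens[:pos] + [rng.choice(HESITATIONS)] + tokens[pos:]


def add_self_correction(tokens, slot_value, alternatives, rng):
    """Replace slot_value with '<wrong value> no sorry <slot_value>'.

    The 'no sorry' template is an assumed example of a self-correction.
    If the slot value is absent, or no different alternative exists,
    the tokens are returned unchanged.
    """
    wrong_choices = [a for a in alternatives if a != slot_value]
    if slot_value not in tokens or not wrong_choices:
        return list(tokens)
    i = tokens.index(slot_value)
    wrong = rng.choice(wrong_choices)
    return tokens[:i] + [wrong, "no", "sorry", slot_value] + tokens[i + 1:]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    clean = "i would like italian food".split()
    corrected = add_self_correction(
        clean, "italian", ["french", "italian", "indian"], rng
    )
    print(" ".join(add_hesitation(corrected, rng)))
```

The intended meaning of the utterance (the final slot value) is unchanged by either transformation, which is why a model that has learned the clean task could in principle still answer correctly; the paper's question is whether it does.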

Code & Models

Repositories

ishalyminov/babi_tools (Official)
