Generating Challenge Datasets for Task-Oriented Conversational Agents through Self-Play

Sourabh Majumdar; Serra Sinem Tekiroglu; Marco Guerini

arXiv:1910.07357 · cs.CL · October 17, 2019 · 1 citation

Open Access

TL;DR

This paper introduces a method to generate challenge datasets for task-oriented conversational agents using dialogue self-play, enabling evaluation of neural models on unseen and complex dialogue patterns.

Contribution

The paper presents a novel approach to create synthetic challenge datasets through dialogue self-play, improving interpretability and robustness testing of neural conversational models.

Findings

01 Neural models show varied performance on different dialogue patterns.

02 Synthetic challenge datasets reveal strengths and weaknesses of models.

03 Self-play enables controlled generation of diverse dialogue scenarios.

Abstract

End-to-end neural approaches are becoming increasingly common in conversational scenarios due to their promising performance when provided with a sufficient amount of data. In this paper, we present a novel methodology to address the interpretability of neural approaches in such scenarios by creating challenge datasets using dialogue self-play over multiple tasks/intents. Dialogue self-play allows generating large amounts of synthetic data; by taking advantage of the complete control over the generation process, we show how neural approaches can be evaluated in terms of unseen dialogue patterns. We propose several out-of-pattern test cases, each of which introduces a natural and unexpected user utterance phenomenon. As a proof of concept, we built a single and a multiple memory network, and show that these two architectures have diverse performances depending on the peculiar dialogue…
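
The idea of self-play with a controlled out-of-pattern test case can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the `restaurant_booking` intent, the slot names and values, the utterance templates, and the choice of "user revises a slot they just filled" as the unexpected phenomenon. A scripted user and a scripted system alternate turns; the training set contains only the regular pattern, while the challenge set injects the revision.

```python
import random

# Hypothetical slot schema and values for one illustrative intent (not from the paper).
SLOTS = {"restaurant_booking": ["cuisine", "location", "party_size"]}
VALUES = {
    "cuisine": ["italian", "indian"],
    "location": ["rome", "paris"],
    "party_size": ["two", "four"],
}


def self_play_dialogue(intent, rng, out_of_pattern=False):
    """Simulate one scripted user/system exchange.

    Returns (turns, goal), where turns is a list of (speaker, utterance)
    pairs and goal maps each slot to the value the user finally wants.
    """
    goal = {slot: rng.choice(VALUES[slot]) for slot in SLOTS[intent]}
    turns = [("user", f"i want a {intent}")]
    for index, slot in enumerate(SLOTS[intent]):
        turns.append(("system", f"request {slot}"))
        turns.append(("user", f"{slot} is {goal[slot]}"))
        if out_of_pattern and index == 0:
            # Assumed unexpected phenomenon: the user immediately revises
            # the first slot, so the final API call must use the new value.
            alternatives = [v for v in VALUES[slot] if v != goal[slot]]
            goal[slot] = rng.choice(alternatives)
            turns.append(("user", f"actually change {slot} to {goal[slot]}"))
    api_args = " ".join(goal[slot] for slot in SLOTS[intent])
    turns.append(("system", f"api_call {intent} {api_args}"))
    return turns, goal


def build_datasets(n_train, n_test, seed=0):
    """Build an in-pattern training set and an out-of-pattern challenge set."""
    rng = random.Random(seed)
    train = [self_play_dialogue("restaurant_booking", rng) for _ in range(n_train)]
    challenge = [
        self_play_dialogue("restaurant_booking", rng, out_of_pattern=True)
        for _ in range(n_test)
    ]
    return train, challenge


if __name__ == "__main__":
    train, challenge = build_datasets(n_train=2, n_test=1)
    for speaker, utterance in challenge[0][0]:
        print(f"{speaker}: {utterance}")
```

Because the generator controls exactly which phenomenon appears in which split, a model trained on `train` can be scored on `challenge` to see whether it handles a pattern it has never observed, which is the kind of evaluation the abstract describes.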

Taxonomy

Topics: Topic Modeling · Speech and dialogue systems · Multimodal Machine Learning Applications

Methods: Test · Interpretability