Generating Challenge Datasets for Task-Oriented Conversational Agents through Self-Play
Sourabh Majumdar, Serra Sinem Tekiroglu, Marco Guerini

TL;DR
This paper introduces a method to generate challenge datasets for task-oriented conversational agents using dialogue self-play, enabling evaluation of neural models on unseen and complex dialogue patterns.
Contribution
The paper presents a novel approach to create synthetic challenge datasets through dialogue self-play, improving interpretability and robustness testing of neural conversational models.
Findings
The two architectures tested, a single and a multiple memory network, perform differently depending on the dialogue pattern.
Synthetic challenge datasets reveal strengths and weaknesses of models.
Self-play enables controlled generation of diverse dialogue scenarios.
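This summary does not describe the paper's generator, tasks, or specific out-of-pattern cases, so the following is only a minimal sketch of the general idea: two scripted bots produce in-pattern dialogues, and a flag injects one unexpected user turn to build a challenge set. The slot names, utterance templates, and the `self_play` function are all hypothetical and chosen purely for illustration.

```python
import random

# Hypothetical slots and values for one illustrative intent (not from the paper).
SLOTS = ["cuisine", "location", "party_size"]
VALUES = {
    "cuisine": ["italian", "indian"],
    "location": ["rome", "paris"],
    "party_size": ["two", "four"],
}


def self_play(rng, out_of_pattern=False):
    """Generate one dialogue as a list of (speaker, utterance) turns.

    A scripted system bot requests each slot in a fixed order and a scripted
    user bot answers. With out_of_pattern=True, the user interrupts once to
    change an earlier answer, a turn the in-pattern dialogues never contain.
    """
    goal = {slot: rng.choice(VALUES[slot]) for slot in SLOTS}
    turns = [("user", "i want to book a table")]
    # The unexpected turn can only occur once at least one slot has been filled.
    change_at = rng.randrange(1, len(SLOTS)) if out_of_pattern else None
    for i, slot in enumerate(SLOTS):
        turns.append(("system", f"request {slot}"))
        if i == change_at:
            old_slot = SLOTS[i - 1]
            alternatives = [v for v in VALUES[old_slot] if v != goal[old_slot]]
            goal[old_slot] = rng.choice(alternatives)
            turns.append(("user", f"actually change {old_slot} to {goal[old_slot]}"))
            # The system repeats its pending request after the interruption.
            turns.append(("system", f"request {slot}"))
        turns.append(("user", f"inform {slot} {goal[slot]}"))
    # The final API call reflects the user's latest values, including any change.
    turns.append(("system", "api_call " + " ".join(goal[s] for s in SLOTS)))
    return turns


rng = random.Random(0)  # fixed seed so the generated sets are reproducible
in_pattern_set = [self_play(rng) for _ in range(3)]
challenge_set = [self_play(rng, out_of_pattern=True) for _ in range(3)]
for speaker, utterance in challenge_set[0]:
    print(f"{speaker}: {utterance}")
```

Because the generator controls every turn, a model trained only on `in_pattern_set` can be scored separately on `challenge_set`, which isolates how it handles the one phenomenon that was withheld from training.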
Abstract
End-to-end neural approaches are becoming increasingly common in conversational scenarios due to their promising performance when provided with a sufficient amount of data. In this paper, we present a novel methodology to address the interpretability of neural approaches in such scenarios by creating challenge datasets using dialogue self-play over multiple tasks/intents. Dialogue self-play allows generating large amounts of synthetic data; by taking advantage of the complete control over the generation process, we show how neural approaches can be evaluated in terms of unseen dialogue patterns. We propose several out-of-pattern test cases, each of which introduces a natural and unexpected user utterance phenomenon. As a proof of concept, we built a single and a multiple memory network, and show that these two architectures have diverse performances depending on the peculiar dialogue…
Taxonomy
Topics: Topic Modeling · Speech and dialogue systems · Multimodal Machine Learning Applications
Methods: Test · Interpretability
