Effective Data Augmentation Approaches to End-to-End Task-Oriented Dialogue
Jun Quan, Deyi Xiong

TL;DR
This paper introduces four automatic data augmentation methods at word and sentence levels for end-to-end task-oriented dialogue systems, significantly improving performance and robustness without costly manual annotation.
Contribution
It presents novel automatic data augmentation approaches and demonstrates their effectiveness in enhancing dialogue system performance on benchmark datasets.
Findings
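Each of the four augmentation methods significantly improves Success F1 over a strong baseline on CamRest676 and KVRET.
The ensemble of the four methods achieves state-of-the-art results on both datasets.
The methods increase the diversity of user utterances, which helps the end-to-end model learn features more robustly.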
Abstract
The training of task-oriented dialogue systems is often confronted with the lack of annotated data. In contrast to previous work which augments training data through expensive crowd-sourcing efforts, we propose four different automatic approaches to data augmentation at both the word and sentence level for end-to-end task-oriented dialogue and conduct an empirical study on their impact. Experimental results on the CamRest676 and KVRET datasets demonstrate that each of the four data augmentation approaches is able to obtain a significant improvement over a strong baseline in terms of Success F1 score and that the ensemble of the four approaches achieves the state-of-the-art results in the two datasets. In-depth analyses further confirm that our methods adequately increase the diversity of user utterances, which enables the end-to-end model to learn features robustly.
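To make the idea of word-level augmentation concrete, here is a minimal sketch of one generic technique, synonym substitution. It is an illustration only and is not claimed to be any of the paper's four methods, which this summary does not spell out. The `SYNONYMS` table, the function names, and the `protected` set of slot values are all hypothetical; the sketch assumes slot values should stay untouched so that existing annotations remain valid for the augmented utterances.

```python
import random

# Hypothetical toy synonym table; a real system would draw on a lexical
# resource or a learned paraphrase model instead.
SYNONYMS = {
    "cheap": ["inexpensive", "affordable"],
    "restaurant": ["eatery", "diner"],
    "find": ["locate"],
}


def augment_utterance(tokens, protected, rng, p=0.5):
    """Return a copy of `tokens` with some words replaced by synonyms.

    Words in `protected` (assumed to be slot values) are never replaced.
    Each other word with a known synonym is replaced with probability `p`.
    """
    out = []
    for tok in tokens:
        key = tok.lower()
        if key not in protected and key in SYNONYMS and rng.random() < p:
            out.append(rng.choice(SYNONYMS[key]))
        else:
            out.append(tok)
    return out


def augment_dataset(utterances, protected, n_copies=2, seed=0):
    """Keep every original utterance and add up to `n_copies` new variants."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    augmented = []
    for tokens in utterances:
        augmented.append(tokens)
        for _ in range(n_copies):
            candidate = augment_utterance(tokens, protected, rng)
            # Skip variants that are unchanged or already present.
            if candidate != tokens and candidate not in augmented:
                augmented.append(candidate)
    return augmented


if __name__ == "__main__":
    user_turns = [["find", "a", "cheap", "restaurant", "in", "the", "north"]]
    for turn in augment_dataset(user_turns, protected={"cheap", "north"}):
        print(" ".join(turn))
```

Protecting slot values is a design choice of this sketch: replacing "cheap" with "inexpensive" would leave the utterance out of sync with a belief state that still records the price range as "cheap".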
Taxonomy
Topics: Topic Modeling · Speech and dialogue systems · Natural Language Processing Techniques
