Comparing Data Augmentation Methods for End-to-End Task-Oriented Dialog Systems
Christos Vlachos, Themos Stafylakis, Ion Androutsopoulos

TL;DR
This paper evaluates various data augmentation techniques for end-to-end task-oriented dialog systems, demonstrating their benefits and providing practical recommendations, especially in few-shot cross-domain scenarios.
Contribution
The paper systematically compares multiple data augmentation (DA) methods in end-to-end task-oriented dialog systems (ToDSs), identifies the most effective techniques, and offers guidance for practical implementation.
Findings
All considered DA methods improve system performance.
Word-level and dialog-level DA methods are particularly effective.
DA methods are beneficial even in few-shot cross-domain settings.
Abstract
Creating effective and reliable task-oriented dialog systems (ToDSs) is challenging, not only because of the complex structure of these systems, but also due to the scarcity of training data, especially when several modules need to be trained separately, each one with its own input/output training examples. Data augmentation (DA), whereby synthetic training examples are added to the training data, has been successful in other NLP systems, but has not been explored as extensively in ToDSs. We empirically evaluate the effectiveness of DA methods in an end-to-end ToDS setting, where a single system is trained to handle all processing stages, from user inputs to system outputs. We experiment with two ToDSs (UBAR, GALAXY) on two datasets (MultiWOZ, KVRET). We consider three types of DA methods (word-level, sentence-level, dialog-level), comparing eight DA methods that have shown promising…
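To make the word-level category concrete, here is a minimal sketch of one generic way to produce synthetic utterances by randomly deleting and swapping tokens. It is an illustration only: the abstract does not name the eight methods compared, so the functions `random_swap`, `random_delete`, and `augment`, along with their parameters, are hypothetical and should not be read as the authors' implementation.

```python
import random


def random_swap(tokens, n_swaps, rng):
    """Return a copy of tokens with n_swaps random position pairs exchanged."""
    out = list(tokens)
    if len(out) < 2:
        return out
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_delete(tokens, p, rng):
    """Drop each token with probability p, always keeping at least one."""
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]


def augment(utterances, copies=1, p_delete=0.1, n_swaps=1, seed=0):
    """Return the original utterances followed by `copies` noisy variants of each.

    Hypothetical helper: whitespace tokenization and the default rates are
    assumptions chosen for illustration, not values taken from the paper.
    """
    rng = random.Random(seed)
    augmented = list(utterances)
    for utt in utterances:
        tokens = utt.split()
        if not tokens:
            continue
        for _ in range(copies):
            noisy = random_swap(random_delete(tokens, p_delete, rng), n_swaps, rng)
            augmented.append(" ".join(noisy))
    return augmented


if __name__ == "__main__":
    data = ["i need a cheap hotel in the north", "book a table for two"]
    for line in augment(data, copies=2, seed=1):
        print(line)
```

In a task-oriented setting, a perturbation like this can delete or move a slot value (for example, "north"), so the dialog-state annotations attached to each utterance would need to be checked or updated after augmentation.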
Taxonomy
Topics: Speech and dialogue systems · Multi-Agent Systems and Negotiation · Service-Oriented Architecture and Web Services
