Conditioned Query Generation for Task-Oriented Dialogue Systems

Stéphane d'Ascoli; Alice Coucke; Francesco Caltagirone; Alexandre Caulier; Marc Lelarge

arXiv:1911.03698 · cs.CL · November 12, 2019 · 1 citation

TL;DR

This paper introduces a controlled data generation method using a conditional variational autoencoder and a novel query transfer protocol to enhance training data diversity for task-oriented dialogue systems, reducing reliance on manual annotation.

Contribution

It presents a new approach for intent-specific sentence generation and a protocol to leverage unlabelled data, improving dialogue query diversity.

Findings

01 Enhanced query diversity without quality loss

02 Effective use of unlabelled data through query transfer

03 Consistent improvement over baseline methods

Abstract

Scarcity of training data for task-oriented dialogue systems is a well-known problem that is usually tackled with costly and time-consuming manual data annotation. An alternative solution is to rely on automatic text generation which, although less accurate than human supervision, has the advantage of being cheap and fast. In this paper we propose a novel controlled data generation method that could be used as a training augmentation framework for closed-domain dialogue. Our contribution is twofold. First, we show how to optimally train and control the generation of intent-specific sentences using a conditional variational autoencoder. Second, we introduce a novel protocol called query transfer that makes it possible to leverage a broad, unlabelled dataset to extract relevant information. Comparison with two different baselines shows that our method, in the appropriate regime, consistently improves…
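As a rough illustration of what a conditional variational autoencoder involves, the sketch below shows two generic ingredients: the closed-form KL term that regularizes the latent code, and conditioning the decoder input on an intent label. It is a minimal standard-library sketch and not the authors' implementation. The function names, the one-hot intent encoding, and the concatenation scheme are all assumptions made for illustration.

```python
import math
import random


def one_hot(index, size):
    """Encode an intent label as a one-hot vector (assumed conditioning scheme)."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), independently per dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(sigma^2)) || N(0, I) )."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0
                     for m, lv in zip(mu, log_var))


def decoder_input(z, intent_index, num_intents):
    """Condition the decoder by concatenating the latent code with the intent vector."""
    return z + one_hot(intent_index, num_intents)


# Hypothetical toy values: a 2-dimensional latent code and 3 intents.
rng = random.Random(0)
mu, log_var = [0.0, 0.0], [0.0, 0.0]
z = reparameterize(mu, log_var, rng)
x = decoder_input(z, intent_index=1, num_intents=3)
kl = kl_to_standard_normal(mu, log_var)
```

At generation time, a model of this kind would typically sample z from the prior and fix the intent vector to the desired class, so the decoder produces a sentence for that intent.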

Code & Models

Repositories

snipsco/automatic-data-generation (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems