A Sequence-to-Sequence Model for User Simulation in Spoken Dialogue Systems
Layla El Asri, Jing He, Kaheer Suleman

TL;DR
This paper presents a neural sequence-to-sequence model for user simulation in spoken dialogue systems. The model takes dialogue history into account, can output several user intentions in one turn, and outperforms agenda-based and n-gram simulators on the DSTC2 dataset.
Contribution
Introduces a data-driven encoder-decoder RNN model for user simulation that handles dialogue history and multiple user intentions, improving over prior methods.
Findings
Outperforms agenda-based and n-gram simulators on the DSTC2 dataset
Models user behavior at a finer granularity by using the original action space rather than a summarized one
Achieves a higher F-score than these baselines
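This page does not say how the F-score is computed. As a rough illustration only, the sketch below assumes a micro-averaged F-score over the dialogue acts of each user turn, comparing simulated acts with the acts found in the corpus. The function name `turn_f_score` and the act strings are hypothetical and are not taken from the paper.

```python
from collections import Counter

def turn_f_score(predicted, reference):
    """Micro-averaged precision, recall and F1 over dialogue acts.

    `predicted` and `reference` are lists of turns; each turn is a list of
    dialogue-act strings. This definition is an assumption, not the paper's.
    """
    tp = fp = fn = 0
    for pred_acts, ref_acts in zip(predicted, reference):
        pred, ref = Counter(pred_acts), Counter(ref_acts)
        overlap = sum((pred & ref).values())  # acts present in both turns
        tp += overlap
        fp += sum(pred.values()) - overlap    # simulated but not in the corpus turn
        fn += sum(ref.values()) - overlap     # in the corpus turn but not simulated
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Hypothetical two-turn example: one spurious act in the first turn.
simulated = [["inform(food)", "request(phone)"], ["bye"]]
corpus = [["inform(food)"], ["bye"]]
precision, recall, f1 = turn_f_score(simulated, corpus)
```

Scoring at the level of individual acts means a simulator that outputs several intentions in one turn gets partial credit for the ones it gets right.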
Abstract
User simulation is essential for generating enough data to train a statistical spoken dialogue system. Previous models for user simulation suffer from several drawbacks, such as the inability to take dialogue history into account, the need for a rigid structure to ensure coherent user behaviour, heavy dependence on a specific domain, the inability to output several user intentions during one dialogue turn, or the requirement of a summarized action space for tractability. This paper introduces a data-driven user simulator based on an encoder-decoder recurrent neural network. The model takes as input a sequence of dialogue contexts and outputs a sequence of dialogue acts corresponding to user intentions. The dialogue contexts include information about the machine acts and the status of the user goal. We show on the Dialogue State Tracking Challenge 2 (DSTC2) dataset that the…
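To make the input side of the abstract concrete, here is a minimal sketch of how one dialogue context (machine acts plus the status of the user goal) might be turned into a feature vector, and a dialogue into a sequence of such vectors for the encoder. The act and slot inventories, the `DialogueContext` class, and the binary encoding are all illustrative assumptions; the paper's actual features are not given on this page, and the encoder-decoder network itself is left out.

```python
from dataclasses import dataclass
from typing import List, Set

# Hypothetical inventories, chosen only for illustration.
MACHINE_ACTS = ["request", "offer", "confirm", "bye"]
GOAL_SLOTS = ["food", "area", "pricerange"]

@dataclass
class DialogueContext:
    machine_acts: List[str]   # acts the system produced in this turn
    informed_slots: Set[str]  # goal constraints the user has already conveyed

    def to_vector(self) -> List[float]:
        # One binary feature per machine act, then one per goal slot.
        acts = [1.0 if a in self.machine_acts else 0.0 for a in MACHINE_ACTS]
        status = [1.0 if s in self.informed_slots else 0.0 for s in GOAL_SLOTS]
        return acts + status

def encode_dialogue(contexts: List[DialogueContext]) -> List[List[float]]:
    """Return the sequence of context vectors an encoder RNN would consume."""
    return [c.to_vector() for c in contexts]

# Hypothetical two-turn dialogue history.
history = [
    DialogueContext(machine_acts=["request"], informed_slots=set()),
    DialogueContext(machine_acts=["confirm", "offer"], informed_slots={"food"}),
]
encoder_input = encode_dialogue(history)
```

Feeding the whole sequence, rather than only the latest context, is what lets a sequence model condition on dialogue history.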
