Making Task-Oriented Dialogue Datasets More Natural by Synthetically Generating Indirect User Requests
Amogh Mannekote, Jinseok Nam, Ziming Li, Jian Gao, Kristy Elizabeth Boyer, Bonnie J. Dorr

TL;DR
This paper introduces a method for synthetically generating indirect user requests to enhance task-oriented dialogue datasets, aiming to improve the evaluation and training of smaller models in understanding complex discourse phenomena.
Contribution
It proposes linguistic criteria and an LLM-based pipeline for creating realistic indirect requests, and releases a new dataset for benchmarking model performance on these requests.
Findings
Synthetically generated indirect user requests (IURs) improve model robustness in understanding such requests.
The dataset enables better evaluation of smaller models' NLU and DST capabilities.
The pipeline facilitates creating more natural dialogue data with complex discourse features.
Abstract
Indirect User Requests (IURs), such as "It's cold in here" instead of "Could you please increase the temperature?" are common in human-human task-oriented dialogue and require world knowledge and pragmatic reasoning from the listener. While large language models (LLMs) can handle these requests effectively, smaller models deployed on virtual assistants often struggle due to resource constraints. Moreover, existing task-oriented dialogue benchmarks lack sufficient examples of complex discourse phenomena such as indirectness. To address this, we propose a set of linguistic criteria along with an LLM-based pipeline for generating realistic IURs to test natural language understanding (NLU) and dialogue state tracking (DST) models before deployment in a new domain. We also release IndirectRequests, a dataset of IURs based on the Schema Guided Dialog (SGD) corpus, as a comparative testbed for…
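The abstract describes an LLM-based pipeline that generates IURs for a target domain. The sketch below illustrates one plausible shape for such a pipeline: prompt a model for candidate utterances given a slot-value pair, then discard candidates that state the value outright. It is a minimal illustration under stated assumptions, not the authors' implementation. The names `SlotValue`, `build_prompt`, `mentions_value`, `generate_iurs`, and `stub_llm` are all hypothetical, the prompt wording is invented, and the verbatim-match filter is only a crude stand-in for the paper's linguistic criteria, which this summary does not spell out.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SlotValue:
    """A target slot-value pair, e.g. drawn from a schema-guided dialogue turn."""
    domain: str
    slot: str
    value: str


def build_prompt(target: SlotValue) -> str:
    """Compose an instruction for the LLM (hypothetical wording, not from the paper)."""
    return (
        f"Domain: {target.domain}\n"
        f"The user wants {target.slot} = {target.value}.\n"
        "Write one user utterance that implies this without stating the value directly."
    )


def mentions_value(utterance: str, target: SlotValue) -> bool:
    """Crude directness check: does the slot value appear verbatim in the utterance?"""
    return target.value.lower() in utterance.lower()


def generate_iurs(
    target: SlotValue,
    llm: Callable[[str, int], List[str]],
    n_candidates: int = 3,
) -> List[str]:
    """Ask the LLM for candidates and keep only those that avoid naming the value."""
    candidates = llm(build_prompt(target), n_candidates)
    return [
        c.strip()
        for c in candidates
        if c.strip() and not mentions_value(c, target)
    ]


def stub_llm(prompt: str, n: int) -> List[str]:
    """Stand-in for a real LLM call (assumption); returns canned candidates."""
    canned = [
        "It's cold in here.",
        "Could you set the temperature to warm?",
        "I can't stop shivering.",
    ]
    return canned[:n]


target = SlotValue(domain="thermostat", slot="temperature", value="warm")
iurs = generate_iurs(target, stub_llm)
```

With the stub above, the second candidate is dropped because it names the value "warm" directly, while the other two pass the filter. A real pipeline would replace `stub_llm` with an actual model call and would likely need stronger checks than string matching, for example a judgment of whether the utterance still unambiguously implies the target value.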
Taxonomy
Topics: Speech and dialogue systems · Topic Modeling · Context-Aware Activity Recognition Systems
Methods: Sparse Evolutionary Training · Cosine Annealing · Residual Connection · Softmax · Layer Normalization · Byte Pair Encoding · Label Smoothing · Adam
