Training Zero-Shot Generalizable End-to-End Task-Oriented Dialog System Without Turn-level Dialog Annotations

Adib Mosharrof; A.B. Siddique

arXiv:2407.15055 · cs.CL · November 5, 2024

TL;DR

This paper introduces a training method for task-oriented dialogue systems that eliminates the need for turn-level annotations, so that systems built on large language models can be trained on unannotated conversational data, generalize to new domains, and learn to retrieve external information autonomously.

Contribution

It presents a multi-task instruction fine-tuning approach that trains dialogue systems without manual turn-level annotations, outperforming existing models and generalizing better across domains.

Findings

01 Outperforms state-of-the-art models trained on annotated data

02 Generalizes effectively to unseen domains

03 Leverages large language models without manual turn-level annotations

Abstract

Task-oriented dialogue (TOD) systems enable users to achieve their goals through natural language interactions. Traditionally, these systems have relied on turn-level manually annotated metadata, such as dialogue states and policy annotations, which are expensive, time-consuming, and often inconsistent or error-prone. This dependence limits the potential to leverage vast amounts of readily available conversational data for training TOD systems. Additionally, a critical challenge in TOD system design is determining when and how to access and integrate information from external sources. Current approaches typically expect this information to be provided alongside the dialogue context, rather than learning to identify and retrieve it autonomously. While pre-trained large language models (LLMs) have been used to develop TOD systems, their potential to train such systems without laborious…

Taxonomy

Topics: Topic Modeling · Speech and dialogue systems · AI in Service Interactions

Methods: Stochastic Gradient Descent