A Hybrid Task-Oriented Dialog System with Domain and Task Adaptive Pretraining

Boliang Zhang; Ying Lyu; Ning Ding; Tianhao Shen; Zhaoyang Jia; Kun Han; Kevin Knight

arXiv:2102.04506 · cs.CL · February 10, 2021 · 5 cites

TL;DR

This paper presents an end-to-end task-oriented dialog system that uses GPT-2 with domain and task adaptive pretraining, heuristic pre/post-processing rules, and error correction, achieving top-tier results in the DSTC-9 End-to-end Multi-domain Task Completion Dialog shared task.

Contribution

It introduces a novel hybrid approach combining GPT-2 with domain-adaptive pretraining, heuristic rules, and fault tolerance for improved dialog system performance.

Findings

1. Outperforms baseline models in multi-domain dialog tasks
2. Achieves top-tier results in the DSTC-9 challenge
3. Demonstrates the effectiveness of domain-specific pretraining and error correction

Abstract

This paper describes our submission for the End-to-end Multi-domain Task Completion Dialog shared task at the 9th Dialog System Technology Challenge (DSTC-9). Participants in the shared task build an end-to-end task completion dialog system which is evaluated by human evaluation and a user simulator based automatic evaluation. Different from traditional pipelined approaches where modules are optimized individually and suffer from cascading failure, we propose an end-to-end dialog system that 1) uses Generative Pretraining 2 (GPT-2) as the backbone to jointly solve Natural Language Understanding, Dialog State Tracking, and Natural Language Generation tasks, 2) adopts Domain and Task Adaptive Pretraining to tailor GPT-2 to the dialog domain before finetuning, 3) utilizes heuristic pre/post-processing rules that greatly simplify the prediction tasks and improve generalizability, and 4)…
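One way to picture how a single language model can cover understanding, state tracking, and generation together is to flatten each dialog turn into one text sequence that a causal model such as GPT-2 is trained to continue. The sketch below is illustrative only: the delimiter tokens, the belief-state layout, and the helper names `serialize_turn` and `delexicalize` are assumptions made for this example, not the format or the rules used in the paper.

```python
from typing import Dict, List

# Hypothetical delimiter tokens (assumed for this sketch, not taken from the paper).
USER, SYSTEM = "<user>", "<system>"
BELIEF, RESPONSE, END = "<belief>", "<response>", "<eos>"


def serialize_turn(
    history: List[str],
    belief: Dict[str, Dict[str, str]],
    response: str,
) -> str:
    """Flatten one dialog turn into a single training string for a causal LM."""
    # Alternate speaker tags, assuming the user speaks first.
    context = " ".join(
        f"{USER if i % 2 == 0 else SYSTEM} {utt.strip()}"
        for i, utt in enumerate(history)
    )
    # Sort domains and slots so the same state always yields the same string.
    state = " ; ".join(
        f"{domain} {slot} = {value}"
        for domain in sorted(belief)
        for slot, value in sorted(belief[domain].items())
    )
    return f"{context} {BELIEF} {state} {RESPONSE} {response.strip()} {END}"


def delexicalize(response: str, values: Dict[str, str]) -> str:
    """Example pre-processing rule: swap concrete values for slot placeholders."""
    # Replace longer values first so a short value cannot clobber a longer one.
    for slot, value in sorted(values.items(), key=lambda kv: -len(kv[1])):
        response = response.replace(value, f"[{slot}]")
    return response
```

Under these assumptions, a training example would be built by delexicalizing the system response and then calling `serialize_turn` on the dialog history, the belief state, and that response. At inference time the model would be given the text up to the belief delimiter and asked to generate the rest, with a post-processing step filling the placeholders back in from database results.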

Taxonomy

Topics: Topic Modeling · Speech and dialogue systems · Natural Language Processing Techniques

Methods: Linear Layer · Cosine Annealing · Residual Connection · Layer Normalization · Attention Dropout · Discriminative Fine-Tuning · Multi-Head Attention · Adam · Linear Warmup With Cosine Annealing · Weight Decay