A Hybrid Task-Oriented Dialog System with Domain and Task Adaptive Pretraining
Boliang Zhang, Ying Lyu, Ning Ding, Tianhao Shen, Zhaoyang Jia, Kun Han, Kevin Knight

TL;DR
This paper presents an end-to-end task-oriented dialog system built on GPT-2, combining domain and task adaptive pretraining, heuristic pre/post-processing rules, and error correction; it reports top-tier results in the DSTC-9 multi-domain task completion challenge.
Contribution
It introduces a novel hybrid approach combining GPT-2 with domain-adaptive pretraining, heuristic rules, and fault tolerance for improved dialog system performance.
Findings
Outperforms baseline models in multi-domain dialog tasks
Achieves top-tier results in the DSTC-9 challenge
Demonstrates effectiveness of domain-specific pretraining and error correction
Abstract
This paper describes our submission for the End-to-end Multi-domain Task Completion Dialog shared task at the 9th Dialog System Technology Challenge (DSTC-9). Participants in the shared task build an end-to-end task completion dialog system which is evaluated by human evaluation and a user simulator based automatic evaluation. Different from traditional pipelined approaches where modules are optimized individually and suffer from cascading failure, we propose an end-to-end dialog system that 1) uses Generative Pretraining 2 (GPT-2) as the backbone to jointly solve Natural Language Understanding, Dialog State Tracking, and Natural Language Generation tasks, 2) adopts Domain and Task Adaptive Pretraining to tailor GPT-2 to the dialog domain before finetuning, 3) utilizes heuristic pre/post-processing rules that greatly simplify the prediction tasks and improve generalizability, and 4)…
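To make point 1) more concrete, here is a minimal sketch of how one dialog turn might be flattened into a single token sequence so that one language model can read the context and then generate the belief state (understanding and state tracking), the system action, and the response (generation) in order. The delimiter tokens, the slot format, and the function names `serialize_turn` and `parse_belief` are all assumptions made for illustration; the abstract does not specify the authors' actual sequence format.

```python
from typing import Dict, List

# Hypothetical delimiter tokens (assumed for illustration, not taken from the paper).
USER, SYSTEM = "<user>", "<system>"
BELIEF, ACTION, RESPONSE, END = "<belief>", "<action>", "<response>", "<eos>"


def serialize_turn(history: List[str], belief: Dict[str, str],
                   action: str, response: str) -> str:
    """Flatten context, belief state, action, and response into one string.

    `history` alternates user/system utterances, starting with the user.
    """
    parts = []
    for i, utterance in enumerate(history):
        speaker = USER if i % 2 == 0 else SYSTEM
        parts.append(f"{speaker} {utterance}")
    # Sort slots so the same belief state always serializes identically.
    slots = " ; ".join(f"{k} = {v}" for k, v in sorted(belief.items()))
    parts.append(f"{BELIEF} {slots}")
    parts.append(f"{ACTION} {action}")
    parts.append(f"{RESPONSE} {response} {END}")
    return " ".join(parts)


def parse_belief(sequence: str) -> Dict[str, str]:
    """Recover the slot-value pairs from the belief segment of a sequence."""
    start = sequence.index(BELIEF) + len(BELIEF)
    end = sequence.index(ACTION)
    belief = {}
    for pair in sequence[start:end].split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            belief[key.strip()] = value.strip()
    return belief


example = serialize_turn(
    history=["i need a cheap hotel"],
    belief={"hotel-price": "cheap"},
    action="hotel-request-area",
    response="which area would you like ?",
)
print(example)
print(parse_belief(example))
```

At training time, sequences like `example` would be the finetuning targets; at inference time, only the history portion would be given as a prompt and the rest generated. Parsing the generated belief segment back into a dictionary is also the natural point where pre/post-processing rules or error correction could be applied, though how the authors do this is not described in the text shown here.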
Taxonomy
Topics: Topic Modeling · Speech and dialogue systems · Natural Language Processing Techniques
Methods: Linear Layer · Cosine Annealing · Residual Connection · Layer Normalization · Attention Dropout · Discriminative Fine-Tuning · Multi-Head Attention · Adam · Linear Warmup With Cosine Annealing · Weight Decay
