A Simple Language Model for Task-Oriented Dialogue
Ehsan Hosseini-Asl, Bryan McCann, Chien-Sheng Wu, Semih Yavuz, Richard Socher

TL;DR
This paper introduces SimpleTOD, a single causal language model for task-oriented dialogue that treats all sub-tasks as one sequence prediction problem and, by leveraging transfer learning from pre-trained models such as GPT-2, achieves state-of-the-art results on the MultiWOZ dataset.
Contribution
The paper presents SimpleTOD, a simple yet effective unified model for dialogue tasks that outperforms previous specialized models on the MultiWOZ dataset.
Findings
Achieves state-of-the-art joint goal accuracy in dialogue state tracking.
Improves inform rate by 8.1 points and success rate by 9.7 points over the prior state of the art in the end-to-end setting.
Demonstrates robustness to noisy annotations.
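Joint goal accuracy, the dialogue state tracking metric named above, is conventionally the fraction of turns whose predicted belief state matches the annotated one exactly. The following is a minimal sketch of that definition, not the paper's evaluation script; the function name and the (domain, slot, value) triple representation are assumptions made for illustration.

```python
from typing import List, Set, Tuple

# One belief-state entry. The (domain, slot, value) layout is an assumption.
Slot = Tuple[str, str, str]


def joint_goal_accuracy(predicted: List[Set[Slot]], gold: List[Set[Slot]]) -> float:
    """Fraction of turns whose predicted belief state equals the gold state exactly."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must cover the same turns")
    if not gold:
        return 0.0
    # A turn counts only if every slot matches and no extra slot is predicted.
    exact = sum(1 for p, g in zip(predicted, gold) if p == g)
    return exact / len(gold)


gold_states = [
    {("hotel", "area", "north")},
    {("hotel", "area", "north"), ("hotel", "pricerange", "cheap")},
]
predicted_states = [
    {("hotel", "area", "north")},
    {("hotel", "area", "north")},  # misses the price range, so this turn is wrong
]
score = joint_goal_accuracy(predicted_states, gold_states)
print(score)
```

Because a single wrong or missing slot makes the whole turn count as incorrect, the metric is strict, and annotation noise in the gold states affects it directly.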
Abstract
Task-oriented dialogue is often decomposed into three tasks: understanding user input, deciding actions, and generating a response. While such decomposition might suggest a dedicated model for each sub-task, we find a simple, unified approach leads to state-of-the-art performance on the MultiWOZ dataset. SimpleTOD is a simple approach to task-oriented dialogue that uses a single, causal language model trained on all sub-tasks recast as a single sequence prediction problem. This allows SimpleTOD to fully leverage transfer learning from pre-trained, open domain, causal language models such as GPT-2. SimpleTOD improves over the prior state-of-the-art in joint goal accuracy for dialogue state tracking, and our analysis reveals robustness to noisy annotations in this setting. SimpleTOD also improves the main metrics used to evaluate action decisions and response generation in an end-to-end…
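To make the "single sequence prediction problem" concrete, here is a minimal sketch of how one dialogue turn could be flattened into a single string for a causal language model. It is an illustration under stated assumptions, not the authors' preprocessing code: the delimiter strings, the function names, and the example turn are all hypothetical.

```python
from typing import List, Tuple

# Delimiter strings are illustrative assumptions, not taken from the abstract.
CONTEXT = ("<|context|>", "<|endofcontext|>")
BELIEF = ("<|belief|>", "<|endofbelief|>")
ACTION = ("<|action|>", "<|endofaction|>")
RESPONSE = ("<|response|>", "<|endofresponse|>")

Triple = Tuple[str, str, str]


def wrap(span: Tuple[str, str], body: str) -> str:
    """Surround one segment with its opening and closing delimiter."""
    return f"{span[0]} {body} {span[1]}"


def format_context(context: List[Tuple[str, str]]) -> str:
    """Join (speaker, utterance) pairs into one string."""
    return " ".join(f"<|{speaker}|> {utterance}" for speaker, utterance in context)


def serialize_turn(
    context: List[Tuple[str, str]],
    belief: List[Triple],
    actions: List[Triple],
    response: str,
) -> str:
    """Flatten context, belief state, actions, and response into one training string."""
    belief_text = ", ".join(" ".join(triple) for triple in belief)
    action_text = ", ".join(" ".join(triple) for triple in actions)
    return " ".join([
        wrap(CONTEXT, format_context(context)),
        wrap(BELIEF, belief_text),
        wrap(ACTION, action_text),
        wrap(RESPONSE, response),
    ])


def generation_prompt(context: List[Tuple[str, str]]) -> str:
    """Prefix given to the model at inference time; it continues from the belief segment."""
    return wrap(CONTEXT, format_context(context)) + " " + BELIEF[0]


turn_context = [("user", "i need a cheap hotel in the north")]
example = serialize_turn(
    context=turn_context,
    belief=[("hotel", "pricerange", "cheap"), ("hotel", "area", "north")],
    actions=[("hotel", "request", "stars")],
    response="what star rating would you like ?",
)
prompt = generation_prompt(turn_context)
print(example)
print(prompt)
```

In this sketch the three sub-tasks share one left-to-right sequence, so a single next-token objective covers understanding (belief segment), action decisions (action segment), and response generation (response segment), which is what allows a pre-trained causal model such as GPT-2 to be fine-tuned without task-specific heads.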
Taxonomy
Topics: Topic Modeling · Speech and dialogue systems · Natural Language Processing Techniques
Methods: Linear Layer · Cosine Annealing · Weight Decay · Residual Connection · Adam · Layer Normalization · Softmax · Attention Is All You Need · Dropout · Discriminative Fine-Tuning
