Task-Oriented Dialogue System as Natural Language Generation

Weizhi Wang, Zhirui Zhang, Junliang Guo, Yinpei Dai, Boxing Chen, and Weihua Luo

arXiv:2108.13679 · cs.CL · April 26, 2022

TL;DR

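This paper reformulates the task-oriented dialogue system as a pure natural language generation task built on GPT-2. It introduces GPT-Adapter-CopyNet, which adds adapter and CopyNet modules to GPT-2 to improve transfer learning and dialogue entity generation, and reports better results than baseline models on the DSTC8 Track 1 benchmark and the MultiWOZ dataset.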

Contribution

It introduces GPT-Adapter-CopyNet, a new model combining adapters and CopyNet with GPT-2, to address dialogue entity inconsistency and catastrophic forgetting in dialogue systems.
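The adapter idea can be illustrated with a minimal sketch. It assumes a generic bottleneck adapter (down-projection, nonlinearity, up-projection, residual connection); the class name `Adapter`, the sizes, and the zero initialization of the up-projection are illustrative assumptions, not details taken from the paper or its repository. The pre-trained weights would stay frozen and only these small matrices would be trained, which is how adapters are generally meant to limit catastrophic forgetting.

```python
import math
import random


def gelu(x):
    # Tanh approximation of GELU, the activation family used in GPT-2.
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))


def matvec(matrix, vector):
    # Plain matrix-vector product on Python lists.
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


class Adapter:
    """Hypothetical bottleneck adapter applied to one hidden-state vector."""

    def __init__(self, hidden_size, bottleneck_size, seed=0):
        rng = random.Random(seed)
        self.hidden_size = hidden_size
        self.bottleneck_size = bottleneck_size
        # Down-projection: bottleneck_size x hidden_size, small random values.
        self.down = [[rng.gauss(0.0, 0.02) for _ in range(hidden_size)]
                     for _ in range(bottleneck_size)]
        # Up-projection: hidden_size x bottleneck_size, zeros (an assumption),
        # so the adapter starts out as the identity function.
        self.up = [[0.0] * bottleneck_size for _ in range(hidden_size)]

    def num_parameters(self):
        # Only these weights would be trained; the backbone stays frozen.
        return 2 * self.hidden_size * self.bottleneck_size

    def __call__(self, hidden):
        bottleneck = [gelu(a) for a in matvec(self.down, hidden)]
        delta = matvec(self.up, bottleneck)
        # Residual connection: output = input + adapter update.
        return [h + d for h, d in zip(hidden, delta)]


adapter = Adapter(hidden_size=8, bottleneck_size=2)
hidden_state = [0.1 * i for i in range(8)]
adapted_state = adapter(hidden_state)
```

With the zero-initialized up-projection, `adapted_state` equals `hidden_state` before any training, so inserting the adapter does not disturb the pre-trained model at the start of fine-tuning.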

Findings

01 Significantly outperforms baseline models on the DSTC8 Track 1 benchmark and the MultiWOZ dataset.

02 Achieves higher automatic and human evaluation scores than the baseline models.

03 Effectively handles dialogue entity generation and transfer learning.

Abstract

In this paper, we propose to formulate the task-oriented dialogue system as a pure natural language generation task, so as to fully leverage large-scale pre-trained models like GPT-2 and simplify the complicated delexicalization preprocessing. However, directly applying this method suffers heavily from the dialogue entity inconsistency caused by the removal of delexicalized tokens, as well as from catastrophic forgetting of the pre-trained model during fine-tuning, leading to unsatisfactory performance. To alleviate these problems, we design a novel GPT-Adapter-CopyNet network, which incorporates lightweight adapter and CopyNet modules into GPT-2 to achieve better performance on transfer learning and dialogue entity generation. Experimental results on the DSTC8 Track 1 benchmark and the MultiWOZ dataset demonstrate that our proposed approach significantly…
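As a rough illustration of how a copy mechanism can help keep entities consistent, the sketch below mixes a vocabulary distribution with a copy distribution over the tokens of the dialogue history. It assumes a generic pointer-style mixture with a scalar gate `p_gen`; the function name `copy_mixture`, the gate, and the toy numbers are all hypothetical and may differ from the formulation actually used in GPT-Adapter-CopyNet.

```python
def copy_mixture(p_vocab, attention, source_ids, p_gen):
    """Blend generation and copy probabilities into one distribution.

    p_vocab    -- probabilities over the vocabulary from the language model
    attention  -- attention weights over source positions (sums to 1)
    source_ids -- vocabulary id of the token at each source position
    p_gen      -- hypothetical gate in [0, 1]: weight given to generation
    """
    final = [p_gen * p for p in p_vocab]
    for weight, token_id in zip(attention, source_ids):
        # Repeated source tokens accumulate their copy probability.
        final[token_id] += (1.0 - p_gen) * weight
    return final


# Toy example: vocabulary of 5 tokens, token 3 appears twice in the history
# (for instance an entity name mentioned by the user).
p_vocab = [0.2, 0.2, 0.2, 0.2, 0.2]
source_ids = [3, 1, 3]
attention = [0.5, 0.2, 0.3]

final = copy_mixture(p_vocab, attention, source_ids, p_gen=0.6)
best_token = max(range(len(final)), key=final.__getitem__)
```

In this toy setting the uniform vocabulary distribution gives no preference to any token, but the copy term raises the probability of token 3 because it is attended to in the history, so `best_token` is 3. This is the general intuition behind copying entities from context rather than generating them from scratch.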

Code & Models

Repositories

victorwz/tod_as_nlg
PyTorch · Official

Taxonomy

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Cosine Annealing · Residual Connection · Layer Normalization · Dense Connections · Attention Dropout · Linear Warmup With Cosine Annealing · Discriminative Fine-Tuning