Language Models as Few-Shot Learner for Task-Oriented Dialogue Systems

Andrea Madotto; Zihan Liu; Zhaojiang Lin; Pascale Fung

arXiv:2008.06239 · cs.CL · August 21, 2020 · 36 cites

TL;DR

This paper evaluates the ability of large language models to perform task-oriented dialogue system modules with few-shot learning through priming, highlighting limitations and future implications.

Contribution

It provides an assessment of language models' few-shot capabilities across dialogue system modules and discusses current limitations and future directions.

Findings

01

Language models can perform NLU, DST, DP, and NLG tasks with few examples.

02

Current limitations include task-specific performance gaps and data efficiency issues.

03

The paper discusses future research directions for improving few-shot dialogue systems.

Abstract

Task-oriented dialogue systems use four connected modules: Natural Language Understanding (NLU), Dialogue State Tracking (DST), Dialogue Policy (DP), and Natural Language Generation (NLG). A research challenge is to learn each module from as few samples as possible (i.e., few-shot), given the high cost of data collection. The most common and effective technique for this problem is transfer learning, where large language models, pre-trained either on text or on task-specific data, are fine-tuned on the few samples. These methods require fine-tuning steps and a set of parameters for each task. In contrast, language models such as GPT-2 (Radford et al., 2019) and GPT-3 (Brown et al., 2020) allow few-shot learning by priming the model with a few examples. In this paper, we evaluate the priming few-shot ability of language models in the NLU, DST, DP and NLG tasks.…
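To make "priming" concrete, here is a minimal sketch of how a few labeled examples might be concatenated into a prompt for an NLU intent-classification query, with no fine-tuning involved. The function name `build_priming_prompt`, the `utterance:`/`intent:` template, and the sample utterances and labels are all illustrative assumptions, not the prompt format used in the paper.

```python
from typing import List, Tuple


def build_priming_prompt(
    shots: List[Tuple[str, str]],
    query: str,
    input_label: str = "utterance",   # hypothetical field name
    output_label: str = "intent",     # hypothetical field name
) -> str:
    """Concatenate labeled examples, then the unlabeled query.

    The language model is expected to continue the text after the
    final output label, which serves as its prediction.
    """
    blocks = [
        f"{input_label}: {text}\n{output_label}: {label}"
        for text, label in shots
    ]
    # The query block ends with an empty output slot for the model to fill.
    blocks.append(f"{input_label}: {query}\n{output_label}:")
    return "\n\n".join(blocks)


# Made-up examples, for illustration only.
shots = [
    ("book a table for two tonight", "restaurant_booking"),
    ("what's the weather in Paris", "weather_query"),
]
prompt = build_priming_prompt(shots, "reserve a taxi to the airport")
print(prompt)
```

The resulting string would be passed as the context to a language model such as GPT-2, and the generated continuation read off as the predicted intent. The same pattern could in principle be adapted to DST, DP, and NLG by changing the labels and example pairs.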

Taxonomy

Topics: Topic Modeling · Speech and dialogue systems · Natural Language Processing Techniques

Methods: Dynamic Sparse Training · Linear Layer · Cosine Annealing · Discriminative Fine-Tuning · Weight Decay · Softmax · Adam · Linear Warmup With Cosine Annealing · Dense Connections