Stabilized In-Context Learning with Pre-trained Language Models for Few Shot Dialogue State Tracking

Derek Chen; Kun Qian; Zhou Yu

arXiv:2302.05932·cs.CL·February 14, 2023·1 cites

Open Access

TL;DR

This paper introduces a stabilized in-context learning approach for dialogue state tracking using pre-trained language models, employing meta-learning, improved retrieval, and a saliency model to enhance few-shot performance on MultiWOZ.

Contribution

It proposes a novel combination of meta-learning, improved retrieval, and dialogue length control to stabilize and enhance few-shot dialogue state tracking with PLMs.

Findings

01. Achieves highly competitive few-shot DST results on MultiWOZ.

02. Stabilizes model performance across different prompts.

03. Enables inclusion of more exemplars through dialogue length reduction.

Abstract

Prompt-based methods with large pre-trained language models (PLMs) have shown impressive unaided performance across many NLP tasks. These models improve even further with the addition of a few labeled in-context exemplars to guide output generation. However, for more complex tasks such as dialogue state tracking (DST), designing prompts that reliably convey the desired intent is nontrivial, leading to unstable results. Furthermore, building in-context exemplars for dialogue tasks is difficult because conversational contexts are long while model input lengths are relatively short. To overcome these issues we first adapt a meta-learning scheme to the dialogue domain which stabilizes the ability of the model to perform well under various prompts. We additionally design a novel training method to improve upon vanilla retrieval mechanisms to find ideal in-context examples. Finally, we…
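The abstract names two practical constraints: exemplars must be chosen by retrieval, and they must fit a short model input. The Python sketch below illustrates only that general pattern and is not the authors' method. Bag-of-words cosine similarity stands in for the paper's trained retriever, a whitespace word count stands in for a real tokenizer, and the names build_prompt and POOL, the "Dialogue:/State:" template, and the example dialogues and slot strings are all hypothetical.

```python
from collections import Counter
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_prompt(query: str, pool: list, max_words: int = 60) -> str:
    """Rank labeled exemplars by similarity to the query, then pack as many
    as fit in the word budget, ending with the unlabeled target dialogue."""
    q_vec = Counter(query.lower().split())
    ranked = sorted(
        pool,
        key=lambda ex: cosine(q_vec, Counter(ex["context"].lower().split())),
        reverse=True,
    )
    target = f"Dialogue: {query}\nState:"
    budget = max_words - len(target.split())  # words left for exemplars
    chosen = []
    for ex in ranked:
        block = f"Dialogue: {ex['context']}\nState: {ex['state']}"
        cost = len(block.split())
        if cost > budget:
            continue  # too long for the remaining budget; try a shorter one
        chosen.append(block)
        budget -= cost
    return "\n\n".join(chosen + [target])


# Hypothetical exemplar pool; the state strings are illustrative only.
POOL = [
    {"context": "i need a cheap hotel in the north",
     "state": "hotel-pricerange=cheap, hotel-area=north"},
    {"context": "book a taxi to the station at 5 pm",
     "state": "taxi-destination=station, taxi-leaveat=5 pm"},
    {"context": "find an italian restaurant in the centre",
     "state": "restaurant-food=italian, restaurant-area=centre"},
]

if __name__ == "__main__":
    print(build_prompt("looking for a cheap hotel in the centre", POOL, max_words=40))
```

Because the budget is fixed, shorter exemplar contexts leave room for more exemplars, which is the motivation the Findings give for dialogue length reduction.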

Taxonomy

Topics: Topic Modeling · Speech and dialogue systems · Natural Language Processing Techniques

Methods: Dynamic Sparse Training