Stabilized In-Context Learning with Pre-trained Language Models for Few Shot Dialogue State Tracking
Derek Chen, Kun Qian, Zhou Yu

TL;DR
This paper introduces a stabilized in-context learning approach for dialogue state tracking using pre-trained language models, employing meta-learning, improved retrieval, and a saliency model to enhance few-shot performance on MultiWOZ.
Contribution
It proposes a novel combination of meta-learning, improved retrieval, and dialogue length control to stabilize and enhance few-shot dialogue state tracking with PLMs.
Findings
Achieves highly competitive few-shot DST results on MultiWOZ.
Stabilizes model performance across different prompts.
Enables inclusion of more exemplars through dialogue length reduction.
Abstract
Prompt-based methods with large pre-trained language models (PLMs) have shown impressive unaided performance across many NLP tasks. These models improve even further with the addition of a few labeled in-context exemplars to guide output generation. However, for more complex tasks such as dialogue state tracking (DST), designing prompts that reliably convey the desired intent is nontrivial, leading to unstable results. Furthermore, building in-context exemplars for dialogue tasks is difficult because conversational contexts are long while model input lengths are relatively short. To overcome these issues we first adapt a meta-learning scheme to the dialogue domain which stabilizes the ability of the model to perform well under various prompts. We additionally design a novel training method to improve upon vanilla retrieval mechanisms to find ideal in-context examples. Finally, we…
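The tension the abstract describes, between long conversational contexts and short model input lengths, can be illustrated with a minimal sketch of exemplar selection under a length budget. This is not the authors' method: the paper trains its retriever, whereas the sketch below substitutes a plain bag-of-words cosine similarity. Whitespace token counting, the `Dialogue:`/`State:` prompt template, the function names, and the choice to place the most similar exemplar closest to the query are all assumptions made for illustration.

```python
import math
from collections import Counter


def bow(text):
    """Bag-of-words vector; a stand-in for a learned retrieval encoder."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two Counter vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def build_prompt(query, pool, max_tokens=60):
    """Greedily pack the most similar exemplars into a fixed token budget.

    Tokens are approximated by whitespace splitting (an assumption; a real
    system would count with the PLM's own tokenizer).
    """
    q = bow(query)
    ranked = sorted(pool, key=lambda ex: cosine(q, bow(ex["context"])), reverse=True)

    query_block = f"Dialogue: {query}\nState:"
    budget = max_tokens - len(query_block.split())

    chosen = []
    for ex in ranked:
        block = f"Dialogue: {ex['context']}\nState: {ex['state']}"
        cost = len(block.split())
        if cost > budget:
            continue  # exemplar does not fit; try a shorter one
        chosen.append(block)
        budget -= cost

    # Least similar first, so the best match sits right before the query.
    return "\n\n".join(list(reversed(chosen)) + [query_block])
```

Shorter dialogue contexts cost fewer tokens each, so more exemplars fit within the same `max_tokens`; this is the motivation the page gives for reducing dialogue length.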
Taxonomy
Topics: Topic Modeling · Speech and dialogue systems · Natural Language Processing Techniques
Methods: Dynamic Sparse Training
