Self-training Improves Pre-training for Few-shot Learning in Task-oriented Dialog Systems

Fei Mi; Wanhao Zhou; Fengyu Cai; Lingjing Kong; Minlie Huang; Boi Faltings

arXiv:2108.12589 · cs.CL · August 31, 2021

TL;DR

This paper introduces a self-training method combined with a novel text augmentation technique to enhance pre-trained models for few-shot task-oriented dialog systems, leveraging unlabeled data to improve performance.

Contribution

It proposes a self-training framework with GradAug augmentation to improve few-shot learning in dialog systems using unlabeled data.

Findings

01 Consistent performance improvements on four dialog tasks.

02 Effective utilization of unlabeled data with self-training.

03 GradAug enhances training of the Student model.
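
The abstract describes GradAug as replacing non-crucial tokens using a masked language model. The following is a minimal sketch of that idea only, not the authors' implementation. This page does not say how token importance is computed, so the sketch takes per-token importance scores as an input. The names `mask_non_crucial`, `augment`, and `fill_mask` are hypothetical, and `fill_mask` is a stand-in for whatever masked language model would propose the replacement tokens.

```python
from typing import Callable, List, Sequence

MASK = "[MASK]"  # assumed mask symbol; real masked LMs define their own


def mask_non_crucial(
    tokens: Sequence[str],
    importance: Sequence[float],
    num_to_mask: int,
) -> List[str]:
    """Replace the `num_to_mask` least important tokens with MASK.

    `importance` is assumed to be given (one score per token); how GradAug
    derives it is not stated on this page.
    """
    if len(tokens) != len(importance):
        raise ValueError("tokens and importance must have the same length")
    # Positions sorted from least to most important (ties keep original order).
    order = sorted(range(len(tokens)), key=lambda i: importance[i])
    to_mask = set(order[:num_to_mask])
    return [MASK if i in to_mask else tok for i, tok in enumerate(tokens)]


def augment(
    tokens: Sequence[str],
    importance: Sequence[float],
    num_to_mask: int,
    fill_mask: Callable[[List[str]], List[str]],
) -> List[str]:
    """Mask the non-crucial tokens, then let `fill_mask` propose replacements.

    `fill_mask` is a hypothetical callable standing in for a masked LM.
    """
    return fill_mask(mask_non_crucial(tokens, importance, num_to_mask))
```

Keeping the high-importance tokens fixed is the point of the design: the augmented sentence should vary on the surface while the words that carry the label stay in place.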

Abstract

As labeling data for the different modules in task-oriented dialog (ToD) systems is expensive, a major challenge is to train these modules with the least amount of labeled data. Recently, large-scale pre-trained language models have shown promising results for few-shot learning in ToD. In this paper, we devise a self-training approach that uses the abundant unlabeled dialog data to further improve state-of-the-art pre-trained models in few-shot learning scenarios for ToD systems. Specifically, the approach iteratively labels the most confident unlabeled data to train a stronger Student model. Moreover, a new text augmentation technique (GradAug) is proposed to better train the Student by replacing non-crucial tokens using a masked language model. We conduct extensive experiments and present analyses on four downstream tasks in ToD, including intent…
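
The iterative labeling step in the abstract can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not the paper's algorithm: the page does not specify how confidence is measured, how many examples are selected per round, or when iteration stops, so `k`, `rounds`, and the `(label, confidence)` predictor interface are all assumptions. `toy_train` is a hypothetical word-overlap classifier included only so the loop can run without a real pre-trained model.

```python
from typing import Callable, Dict, List, Sequence, Set, Tuple

Example = Tuple[str, str]                        # (text, label)
Predictor = Callable[[str], Tuple[str, float]]   # text -> (label, confidence)
Trainer = Callable[[List[Example]], Predictor]


def select_most_confident(
    predict: Predictor, pool: Sequence[str], k: int
) -> Tuple[List[Example], List[str]]:
    """Pseudo-label the k pool items the current model is most confident about."""
    scored = []
    for idx, text in enumerate(pool):
        label, confidence = predict(text)
        scored.append((confidence, idx, label))
    # Highest confidence first; ties broken by original position.
    scored.sort(key=lambda item: (-item[0], item[1]))
    chosen = scored[:k]
    chosen_idx = {idx for _, idx, _ in chosen}
    pseudo = [(pool[idx], label) for _, idx, label in chosen]
    remaining = [t for i, t in enumerate(pool) if i not in chosen_idx]
    return pseudo, remaining


def self_train(
    train_fn: Trainer,
    labeled: Sequence[Example],
    unlabeled: Sequence[str],
    rounds: int,
    k: int,
) -> Tuple[Predictor, List[Example]]:
    """Grow the training set with confident pseudo-labels, retraining each round."""
    data, pool = list(labeled), list(unlabeled)
    model = train_fn(data)            # initial model trained on the few labels
    for _ in range(rounds):
        if not pool:
            break
        pseudo, pool = select_most_confident(model, pool, k)
        data.extend(pseudo)
        model = train_fn(data)        # the new Student labels the next round
    return model, data


def toy_train(data: List[Example]) -> Predictor:
    """Hypothetical stand-in trainer: scores labels by word overlap."""
    vocab: Dict[str, Set[str]] = {}
    for text, label in data:
        vocab.setdefault(label, set()).update(text.lower().split())

    def predict(text: str) -> Tuple[str, float]:
        words = set(text.lower().split())
        scores = {
            label: len(words & seen) / max(len(words), 1)
            for label, seen in vocab.items()
        }
        best = max(sorted(scores), key=scores.get)
        return best, scores[best]

    return predict
```

In practice `train_fn` would fine-tune a pre-trained model on the labeled and pseudo-labeled examples; the loop structure stays the same regardless of which model is plugged in.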

Code & Models

Repositories

mifei/st-tod (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Speech and dialogue systems · Multimodal Machine Learning Applications