Self-training Improves Pre-training for Few-shot Learning in Task-oriented Dialog Systems
Fei Mi, Wanhao Zhou, Fengyu Cai, Lingjing Kong, Minlie Huang, and Boi Faltings

TL;DR
This paper introduces a self-training method combined with a novel text augmentation technique to enhance pre-trained models for few-shot task-oriented dialog systems, leveraging unlabeled data to improve performance.
Contribution
It proposes a self-training framework with GradAug augmentation to improve few-shot learning in dialog systems using unlabeled data.
Findings
Consistent performance improvements on four dialog tasks.
Effective utilization of unlabeled data with self-training.
GradAug enhances training of the Student model.
Abstract
As the labeling cost for different modules in task-oriented dialog (ToD) systems is high, a major challenge is to train different modules with the least amount of labeled data. Recently, large-scale pre-trained language models have shown promising results for few-shot learning in ToD. In this paper, we devise a self-training approach to utilize the abundant unlabeled dialog data to further improve state-of-the-art pre-trained models in few-shot learning scenarios for ToD systems. Specifically, we propose a self-training approach that iteratively labels the most confident unlabeled data to train a stronger Student model. Moreover, a new text augmentation technique (GradAug) is proposed to better train the Student by replacing non-crucial tokens using a masked language model. We conduct extensive experiments and present analyses on four downstream tasks in ToD, including intent…
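The iterative pseudo-labeling loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the nearest-centroid "model", the margin-based confidence score, and the names `self_train`, `rounds`, and `top_k` are all assumptions chosen to keep the example self-contained. The paper applies the idea to pre-trained language models, and GradAug is not shown here.

```python
from typing import Dict, List, Tuple

Example = Tuple[float, int]  # (1-D feature, label): toy stand-in for a labeled utterance


def train_centroids(data: List[Example]) -> Dict[int, float]:
    """Toy 'model' (assumption): the per-class mean of a 1-D feature."""
    sums: Dict[int, float] = {}
    counts: Dict[int, int] = {}
    for x, y in data:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict(model: Dict[int, float], x: float) -> Tuple[int, float]:
    """Return (label, confidence); confidence is the distance margin
    between the nearest and second-nearest class centroid."""
    dists = sorted((abs(x - c), y) for y, c in model.items())
    best_d, best_y = dists[0]
    second_d = dists[1][0] if len(dists) > 1 else best_d + 1.0
    return best_y, second_d - best_d


def self_train(labeled: List[Example], unlabeled: List[float],
               rounds: int = 3, top_k: int = 2):
    """Hypothetical self-training loop: the Teacher pseudo-labels the pool,
    the top_k most confident items join the labeled set, a Student is
    trained on the enlarged set and becomes the next Teacher."""
    labeled = list(labeled)
    pool = list(unlabeled)
    teacher = train_centroids(labeled)
    for _ in range(rounds):
        if not pool:
            break
        # Rank pool indices by Teacher confidence, most confident first.
        ranked = sorted(range(len(pool)),
                        key=lambda i: predict(teacher, pool[i])[1],
                        reverse=True)
        picked = set(ranked[:top_k])
        for i in picked:
            label, _ = predict(teacher, pool[i])
            labeled.append((pool[i], label))
        pool = [x for i, x in enumerate(pool) if i not in picked]
        student = train_centroids(labeled)  # Student trained on enlarged set
        teacher = student                   # Student becomes the next Teacher
    return teacher, labeled


model, grown = self_train([(0.0, 0), (10.0, 1)], [1.0, 9.0, 4.0, 6.0], rounds=2)
print(len(grown), predict(model, 2.0)[0])
```

Selecting only the most confident predictions in each round is meant to limit how much label noise the Student inherits from the Teacher. In this toy run the ambiguous points 4.0 and 6.0 are labeled only in the second round, after the easier points have been absorbed.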
Taxonomy
Topics: Topic Modeling · Speech and dialogue systems · Multimodal Machine Learning Applications
