Augmenting Slot Values and Contexts for Spoken Language Understanding with Pretrained Models

Haitao Lin; Lu Xiang; Yu Zhou; Jiajun Zhang; Chengqing Zong

arXiv:2108.08451·cs.CL·September 3, 2021

TL;DR

This paper introduces two novel finetuning strategies for pretrained language models to generate diverse augmented data, significantly improving slot filling in spoken language understanding systems.

Contribution

The paper proposes value-based and context-based augmentation, two strategies for finetuning pretrained language models that increase training data diversity for SLU when labeled data is limited.

Findings

01. Improved SLU performance on two datasets
02. Generated more diverse training sentences
03. Outperformed existing augmentation methods

Abstract

Spoken Language Understanding (SLU) is an essential step in building a dialogue system. Because labeled data is expensive to obtain, SLU suffers from data scarcity. In this paper, we therefore focus on data augmentation for the slot filling task in SLU, aiming to generate more diverse data from existing data. Specifically, we exploit the latent language knowledge in pretrained language models by finetuning them. We propose two strategies for the finetuning process: value-based and context-based augmentation. Experimental results on two public SLU datasets show that, compared with existing data augmentation methods, our proposed method generates more diverse sentences and significantly improves SLU performance.
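The abstract does not spell out how an augmented slot filling example is built, so the following is only a minimal sketch of one plausible reading of value-based augmentation: keep the context of an utterance and swap in a new slot value, re-tagging it in BIO format. The function name `replace_slot_value`, the example utterance, and the slot labels are hypothetical. In the paper the new values would come from a finetuned pretrained language model; here the value is supplied by hand.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (word, BIO tag); hypothetical representation


def replace_slot_value(
    sentence: List[Token], slot: str, new_value: List[str]
) -> List[Token]:
    """Swap the first span labeled `slot` for `new_value`, re-tagging in BIO."""
    out: List[Token] = []
    i = 0
    replaced = False
    while i < len(sentence):
        word, tag = sentence[i]
        if not replaced and tag == f"B-{slot}":
            # Skip the original span: B-slot followed by any I-slot tokens.
            i += 1
            while i < len(sentence) and sentence[i][1] == f"I-{slot}":
                i += 1
            # Emit the new value with fresh BIO tags.
            for j, w in enumerate(new_value):
                out.append((w, f"B-{slot}" if j == 0 else f"I-{slot}"))
            replaced = True
        else:
            out.append((word, tag))
            i += 1
    return out


# Made-up utterance and labels, for illustration only.
example = [
    ("flights", "O"),
    ("to", "O"),
    ("new", "B-city"),
    ("york", "I-city"),
    ("tomorrow", "B-date"),
]
augmented = replace_slot_value(example, "city", ["boston"])
```

Context-based augmentation would presumably go the other way, holding the slot values fixed and generating a new surrounding sentence. That direction needs a generative model and is not sketched here.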

Code & Models

Repositories

xiaolinandy/slu-aug-prlm (PyTorch, official)
