Style Attuned Pre-training and Parameter Efficient Fine-tuning for Spoken Language Understanding

Jin Cao; Jun Wang; Wael Hamza; Kelly Vanee; Shang-Wen Li

arXiv:2010.04355 · cs.CL · October 12, 2020 · 5 cites

TL;DR

This paper presents a spoken language understanding (SLU) framework that combines conversational language modeling pre-training with a lightweight encoder, enabling efficient domain adaptation and matching state-of-the-art performance with minimal additional parameters.

Contribution

The proposed framework introduces a conversational language modeling pre-training task and a light encoder architecture, improving domain adaptation efficiency in spoken language understanding.

Findings

01 Matches state-of-the-art results on multiple SLU datasets.

02 Adds only 4.4% additional parameters for each new domain.

03 Effectively captures conversational language, including ASR errors.
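
To make the 4.4% figure concrete, the arithmetic below compares a shared backbone with small per-domain additions against training a full model copy per domain. This is only an illustrative sketch: the backbone size of 110 million parameters is an assumption chosen for the example (the page does not state the model size), and the function names are hypothetical.

```python
# Illustrative parameter-count arithmetic; not taken from the paper.
# ASSUMPTION: the backbone size below is hypothetical, used only for scale.
ASSUMED_BACKBONE_PARAMS = 110_000_000
PER_DOMAIN_FRACTION = 0.044  # the 4.4% per-domain overhead quoted above


def shared_backbone_total(base: int, fraction: float, num_domains: int) -> int:
    """Total parameters when one backbone is shared across all domains and
    each domain adds `fraction` of the backbone size in new parameters."""
    per_domain = round(base * fraction)
    return base + per_domain * num_domains


def full_copy_total(base: int, num_domains: int) -> int:
    """Total parameters when every domain gets its own full model."""
    return base * num_domains


if __name__ == "__main__":
    for n in (1, 5, 10):
        shared = shared_backbone_total(ASSUMED_BACKBONE_PARAMS, PER_DOMAIN_FRACTION, n)
        copies = full_copy_total(ASSUMED_BACKBONE_PARAMS, n)
        print(f"{n:>2} domains: shared={shared:,} full copies={copies:,}")
```

Under these assumptions the gap widens linearly with the number of domains, which is the inefficiency the abstract attributes to training an entirely new model for every domain.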

Abstract

Neural models have yielded state-of-the-art results on spoken language understanding (SLU) problems; however, these models require a significant number of domain-specific labeled examples for training, which are prohibitively expensive to obtain. While pre-trained language models like BERT have been shown to capture a massive amount of knowledge by learning from unlabeled corpora and to solve SLU using fewer labeled examples for adaptation, the encoding of knowledge is implicit and agnostic to downstream tasks. Such encoding results in inefficient parameter usage: an entirely new model is required for every domain. To address these challenges, we introduce a novel SLU framework, comprising a conversational language modeling (CLM) pre-training task and a light encoder architecture. The CLM pre-training enables networks to capture the representation of the language in conversation…

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems

Methods: Linear Layer · Adam · Dense Connections · WordPiece · Multi-Head Attention · Layer Normalization · Linear Warmup With Linear Decay · Attention Dropout · Weight Decay · Dropout