Prompt-tuning in ASR systems for efficient domain-adaptation

Saket Dingliwal; Ashish Shenoy; Sravan Bodapati; Ankur Gandhe; Ravi Teja Gadde; Katrin Kirchhoff

arXiv:2110.06502·cs.CL·October 26, 2021

TL;DR

This paper introduces prompt-tuning for transformer-based language models in ASR systems, enabling efficient domain adaptation with minimal parameters and achieving comparable performance to full fine-tuning.

Contribution

It proposes a prompt-tuning approach that adapts large language models for domain-specific ASR with significantly fewer parameters than traditional methods.

Findings

1. Prompt-tuning improves perplexity scores over unadapted LMs.
2. Parameter-efficient adaptation achieves results similar to full fine-tuning.
3. Enhanced domain-specific ASR performance demonstrated through WER reduction.

Abstract

Automatic Speech Recognition (ASR) systems are used in numerous industrial applications across very diverse domains. Since domain-specific systems perform better than their generic counterparts on in-domain evaluation, there is a clear need for memory- and compute-efficient domain adaptation. In particular, adapting the parameter-heavy transformer-based language models used for rescoring ASR hypotheses is challenging. In this work, we overcome the problem using prompt-tuning, a methodology that trains a small number of domain token embedding parameters to prime a transformer-based LM to a particular domain. With just a handful of extra parameters per domain, we achieve much better perplexity scores than the baseline of using an unadapted LM. Despite being parameter-efficient, these improvements are comparable to those of fully fine-tuned models with hundreds of millions of parameters.…
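
The mechanism the abstract describes, a few trainable domain token embeddings placed in front of the input to a frozen LM, can be illustrated with a minimal sketch. It shows only how the LM input is assembled and how the trainable parameters are counted; it does not train anything and is not the authors' implementation. The function names and all sizes (20 prompt tokens, embedding width 768, a frozen LM with 100 million parameters) are hypothetical values chosen for illustration, not figures from the paper.

```python
import random


def init_domain_prompt(num_prompt_tokens, embed_dim, seed=0):
    """Create the only trainable parameters: one embedding vector per prompt token."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.02) for _ in range(embed_dim)]
            for _ in range(num_prompt_tokens)]


def prepend_prompt(domain_prompt, token_embeddings):
    """Build the LM input: domain prompt vectors followed by the hypothesis embeddings."""
    return domain_prompt + token_embeddings


def count_prompt_params(domain_prompt):
    """Number of scalar parameters stored per domain."""
    return sum(len(vector) for vector in domain_prompt)


# Hypothetical sizes, assumed for illustration only.
NUM_PROMPT_TOKENS = 20
EMBED_DIM = 768
FROZEN_LM_PARAMS = 100_000_000

domain_prompt = init_domain_prompt(NUM_PROMPT_TOKENS, EMBED_DIM)

# Stand-in for the embeddings of a 5-token ASR hypothesis from the frozen LM.
hypothesis = [[0.0] * EMBED_DIM for _ in range(5)]

lm_input = prepend_prompt(domain_prompt, hypothesis)
prompt_params = count_prompt_params(domain_prompt)
trainable_share = prompt_params / (prompt_params + FROZEN_LM_PARAMS)

print(len(lm_input), prompt_params, f"{trainable_share:.6%}")
```

Under these assumed sizes, each additional domain costs only `NUM_PROMPT_TOKENS * EMBED_DIM` stored values, while the frozen LM weights are shared across all domains.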

Taxonomy

Topics: Speech Recognition and Synthesis · Topic Modeling · Natural Language Processing Techniques