Prompt-tuning in ASR systems for efficient domain-adaptation
Saket Dingliwal, Ashish Shenoy, Sravan Bodapati, Ankur Gandhe, Ravi Teja Gadde, Katrin Kirchhoff

TL;DR
This paper introduces prompt-tuning for transformer-based language models in ASR systems, enabling efficient domain adaptation with minimal parameters and achieving comparable performance to full fine-tuning.
Contribution
It proposes a prompt-tuning approach that adapts transformer-based language models used for rescoring ASR hypotheses to a specific domain by training only a small number of domain token embedding parameters, far fewer than full fine-tuning updates.
Findings
Prompt-tuning improves perplexity scores over unadapted LMs.
Parameter-efficient adaptation achieves results similar to full fine-tuning.
Domain-specific ASR performance improves, as shown by a reduction in word error rate (WER).
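The perplexity comparison in the findings above can be made concrete with a small sketch. It assumes the usual definition of perplexity as the exponential of the negative mean log-probability per token. The function name and the per-token probabilities are hypothetical values chosen for illustration; they are not results from the paper.

```python
import math


def perplexity(token_log_probs):
    """Perplexity = exp(-mean natural-log probability per token)."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


# Hypothetical per-token probabilities on an in-domain sentence (not paper data).
unadapted_log_probs = [math.log(0.10)] * 4  # generic LM, less confident
adapted_log_probs = [math.log(0.25)] * 4    # domain-primed LM, more confident

unadapted_ppl = perplexity(unadapted_log_probs)
adapted_ppl = perplexity(adapted_log_probs)

# Lower perplexity means the LM finds the in-domain text less surprising.
print(f"unadapted: {unadapted_ppl:.2f}, adapted: {adapted_ppl:.2f}")
```

Under these made-up numbers the adapted model has the lower perplexity, which is the direction of improvement the findings describe.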
Abstract
Automatic Speech Recognition (ASR) systems have found their use in numerous industrial applications in very diverse domains. Since domain-specific systems perform better than their generic counterparts on in-domain evaluation, the need for memory and compute-efficient domain adaptation is obvious. Particularly, adapting parameter-heavy transformer-based language models used for rescoring ASR hypotheses is challenging. In this work, we overcome the problem using prompt-tuning, a methodology that trains a small number of domain token embedding parameters to prime a transformer-based LM to a particular domain. With just a handful of extra parameters per domain, we achieve much better perplexity scores over the baseline of using an unadapted LM. Despite being parameter-efficient, these improvements are comparable to those of fully-fine-tuned models with hundreds of millions of parameters.…
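As a rough illustration of the mechanism the abstract describes, the sketch below builds a small matrix of domain token embeddings and places it in front of a sentence's input embeddings, so that only the prompt matrix would be trained while the LM stays frozen. This is a minimal sketch in plain Python, not the authors' implementation: the helper names, the prompt length, the embedding size, and the initialization are all assumptions chosen for readability.

```python
import random


def make_domain_prompt(num_prompt_tokens, embed_dim, seed=0):
    """Create the trainable part: one embedding vector per domain token."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.02) for _ in range(embed_dim)]
            for _ in range(num_prompt_tokens)]


def prepend_prompt(domain_prompt, token_embeddings):
    """Prime the LM by placing the domain prompt before the input embeddings."""
    return domain_prompt + token_embeddings


def count_params(matrix):
    """Number of scalar parameters in a list-of-lists matrix."""
    return sum(len(row) for row in matrix)


# Hypothetical sizes (assumptions, far smaller than a real transformer LM).
EMBED_DIM = 8
NUM_PROMPT_TOKENS = 4

domain_prompt = make_domain_prompt(NUM_PROMPT_TOKENS, EMBED_DIM)

# Stand-in for the frozen LM's embeddings of a 5-token hypothesis.
sentence_embeddings = [[0.0] * EMBED_DIM for _ in range(5)]

lm_input = prepend_prompt(domain_prompt, sentence_embeddings)
trainable_params = count_params(domain_prompt)

print(len(lm_input), "input positions;", trainable_params, "trainable parameters")
```

The trainable parameter count per domain is the prompt length times the embedding dimension, independent of the size of the frozen LM, which is why the per-domain cost stays small.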
Taxonomy
Topics: Speech Recognition and Synthesis · Topic Modeling · Natural Language Processing Techniques
