Improving Domain-Specific ASR with LLM-Generated Contextual Descriptions
Jiwon Suh, Injae Na, Woohwan Jung

TL;DR
This paper introduces a method to improve domain-specific automatic speech recognition by combining LLM-generated contextual descriptions, decoder fine-tuning, and context perturbation, achieving better accuracy without altering the architecture of the underlying ASR model.
Contribution
It presents an approach that improves domain-specific ASR by conditioning Whisper on LLM-generated descriptions and adding two training techniques, decoder fine-tuning and context perturbation, while leaving the original model architecture unchanged.
Findings
LLM-generated descriptions outperform human-crafted ones in ASR accuracy
Proposed methods significantly improve domain-specific recognition on real datasets
Decoder fine-tuning and context perturbation further enhance performance
Abstract
End-to-end automatic speech recognition (E2E ASR) systems have significantly improved speech recognition through training on extensive datasets. Despite these advancements, they still struggle to accurately recognize domain-specific words, such as proper nouns and technical terminology. To address this problem, we propose a method to utilize the state-of-the-art Whisper model without modifying its architecture, preserving its generalization performance while enabling it to leverage descriptions effectively. Moreover, we propose two additional training techniques to improve domain-specific ASR: decoder fine-tuning and context perturbation. We also propose a method to use a Large Language Model (LLM) to generate descriptions from simple metadata when descriptions are unavailable. Our experiments demonstrate that the proposed methods notably enhance domain-specific ASR accuracy on real-life…
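The abstract names two steps that are easy to picture in code: turning simple metadata into an LLM request for a description, and perturbing the context during training. The following is a minimal sketch under stated assumptions, not the authors' implementation. The function names, the prompt wording, and the specific perturbation rule (swapping in another sample's description with some probability) are all hypothetical; the excerpt above does not define how context perturbation is actually performed.

```python
import random


def build_description_prompt(metadata: dict[str, str]) -> str:
    """Format simple metadata into a prompt asking an LLM for a short description.

    The template wording is hypothetical and not taken from the paper.
    """
    fields = "\n".join(f"{key}: {value}" for key, value in metadata.items())
    return (
        "Write a short description of the following recording, "
        "mentioning domain-specific terms that are likely to be spoken.\n"
        f"{fields}\nDescription:"
    )


def perturb_contexts(
    descriptions: list[str], prob: float, rng: random.Random
) -> list[str]:
    """Return a copy of `descriptions` where each entry is, with probability
    `prob`, replaced by the description of a different sample.

    Assumption: this swap rule is one plausible reading of "context
    perturbation"; the paper may define it differently.
    """
    if len(descriptions) < 2:
        # Nothing to swap with, so leave the batch unchanged.
        return list(descriptions)
    perturbed = []
    for i, desc in enumerate(descriptions):
        if rng.random() < prob:
            # Pick the index of any other sample in the batch.
            j = rng.choice([k for k in range(len(descriptions)) if k != i])
            perturbed.append(descriptions[j])
        else:
            perturbed.append(desc)
    return perturbed
```

In a training loop, one would presumably call `perturb_contexts` on each batch of descriptions before tokenizing them as the decoder's context, so that the model sees occasional mismatched descriptions and does not over-trust them. Whether that matches the paper's motivation is likewise an assumption.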
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Semantic Web and Ontologies
