ASR Error Correction using Large Language Models
Rao Ma, Mengjie Qian, Mark Gales, Kate Knill

TL;DR
This paper explores leveraging large language models for error correction in ASR transcriptions, utilizing N-best lists and constrained decoding to improve accuracy across diverse domains and ASR systems.
Contribution
It introduces a novel approach using LLMs with N-best lists and constrained decoding for domain-agnostic, zero-shot ASR error correction without retraining models.
Findings
Improved correction accuracy on multiple datasets
Effective zero-shot correction with LLMs like ChatGPT
Enhanced model ensembling capabilities
Abstract
Error correction (EC) models play a crucial role in refining Automatic Speech Recognition (ASR) transcriptions, improving their readability and quality. Because EC does not require access to the underlying code or model weights, it can improve performance and provide domain adaptation for black-box ASR systems. This work investigates the use of large language models (LLMs) for error correction across diverse scenarios. 1-best ASR hypotheses are commonly used as the input to EC models. We propose building high-performance EC models using ASR N-best lists, which should provide more contextual information for the correction process. Additionally, the generation process of a standard EC model is unrestricted, in the sense that any output sequence can be generated. For some scenarios, such as unseen domains, this flexibility may impact performance. To address this, we introduce a…
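As a rough illustration of the two ideas named above (N-best input and a restricted output space), the sketch below formats an N-best list as a prompt and, separately, limits the corrected output to one of the listed hypotheses. It is not the paper's method: the function names `build_prompt`, `constrained_select`, and `toy_score`, the prompt wording, and the toy vocabulary scorer are all assumptions made for this example. In practice the scorer would be replaced by a model-based score.

```python
from typing import Callable, List


def build_prompt(nbest: List[str]) -> str:
    """Format an N-best list as a numbered prompt (hypothetical wording)."""
    lines = [f"{i}. {hyp}" for i, hyp in enumerate(nbest, start=1)]
    return (
        "The following are ASR hypotheses for one utterance, best first.\n"
        + "\n".join(lines)
        + "\nReturn the most likely correct transcription."
    )


def constrained_select(nbest: List[str], score: Callable[[str], float]) -> str:
    """Return the highest-scoring hypothesis; output is limited to the N-best list."""
    if not nbest:
        raise ValueError("N-best list must not be empty")
    # max() keeps the earliest hypothesis on ties, so the ASR ranking breaks ties.
    return max(nbest, key=score)


def toy_score(hyp: str) -> float:
    """Toy stand-in for a model score: fraction of words in a tiny vocabulary."""
    vocab = {"the", "cat", "sat", "on", "mat"}  # assumption, for illustration only
    words = hyp.split()
    return sum(w in vocab for w in words) / max(len(words), 1)


nbest = [
    "the cat sad on the mat",
    "the cat sat on the mat",
    "the cat sat on the mad",
]
prompt = build_prompt(nbest)
best = constrained_select(nbest, toy_score)
```

Restricting the output to the listed hypotheses means the corrector cannot introduce words the ASR system never proposed, which is one way to limit unrestricted generation on unseen domains.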
Taxonomy
Topics: Fault Detection and Control Systems
