Context Biasing for Pronunciation-Orthography Mismatch in Automatic Speech Recognition

Christian Huber; Alexander Waibel

arXiv:2506.18703 · cs.CL · March 5, 2026

TL;DR

This paper introduces a context biasing method for speech recognition in which users add corrections of substitution errors on the fly during inference. The method targets words whose pronunciation and spelling do not match, and it yields a 22-34% relative improvement in biased word error rate.

Contribution

The proposed method uses corrections of substitution errors, which users can add during inference, to improve recognition of words not seen during training while maintaining overall recognition performance.

Findings

01. Achieved 22-34% relative improvement in biased word error rate
02. Effective handling of pronunciation-orthography mismatches
03. Maintains overall recognition performance
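To make the "relative improvement" figure concrete, the arithmetic can be sketched as follows. The function name and the example error rates are hypothetical and chosen only for illustration; they are not results from the paper.

```python
def relative_improvement(baseline_wer: float, new_wer: float) -> float:
    """Return the relative reduction in error rate, as a fraction of the baseline.

    A value of 0.22 means the new error rate is 22% lower than the baseline.
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


if __name__ == "__main__":
    # Hypothetical numbers: a biased WER of 10.0% reduced to 7.8%.
    gain = relative_improvement(10.0, 7.8)
    print(f"relative improvement: {gain:.0%}")
```

Note that a relative improvement is measured against the baseline error rate, so the same absolute reduction counts for more when the baseline is already low.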

Abstract

Neural sequence-to-sequence systems deliver state-of-the-art performance for automatic speech recognition. When using appropriate modeling units, e.g., byte-pair encoding, these systems are in principle open vocabulary systems. In practice, however, they often fail to recognize words not seen during training, e.g., named entities, acronyms, or domain-specific special words. To address this problem, many context biasing methods have been proposed; however, these methods may still struggle when they are unable to relate audio and corresponding text, e.g., in case of a pronunciation-orthography mismatch. We propose a method where corrections of substitution errors can be used to improve the recognition accuracy of such challenging words. Users can add corrections on the fly during inference. We show that with this method we get a relative improvement in biased word error rate between 22%…
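The abstract reports results in terms of biased word error rate but the text here does not define it. The sketch below assumes one common convention: align the reference and hypothesis by edit distance, then count only the errors that involve words on the biasing list. The function names, the word list, and the example sentence are hypothetical, and the paper's exact metric may differ.

```python
def align(ref, hyp):
    """Levenshtein-align two word lists; return (op, ref_word, hyp_word) tuples."""
    n, m = len(ref), len(hyp)
    # dist[i][j] is the edit distance between ref[:i] and hyp[:j].
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i
    for j in range(m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j - 1] + cost,  # match or substitution
                             dist[i - 1][j] + 1,         # deletion
                             dist[i][j - 1] + 1)         # insertion
    # Trace back from the bottom-right corner to recover the operations.
    ops, i, j = [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            if dist[i][j] == dist[i - 1][j - 1] + cost:
                ops.append(("ok" if cost == 0 else "sub", ref[i - 1], hyp[j - 1]))
                i, j = i - 1, j - 1
                continue
        if i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            ops.append(("del", ref[i - 1], None))
            i -= 1
        else:
            ops.append(("ins", None, hyp[j - 1]))
            j -= 1
    ops.reverse()
    return ops


def biased_wer(ref, hyp, bias_words):
    """Error rate restricted to words on the biasing list (assumed definition)."""
    total = sum(1 for word in ref if word in bias_words)
    errors = 0
    for op, ref_word, hyp_word in align(ref, hyp):
        if op in ("sub", "del") and ref_word in bias_words:
            errors += 1
        elif op == "ins" and hyp_word in bias_words:
            errors += 1
    return errors / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical mismatch: the name is spelled "nguyen" but sounds like "win".
    reference = "please call doctor nguyen today".split()
    hypothesis = "please call doctor win today".split()
    print(biased_wer(reference, hypothesis, {"nguyen"}))
```

Restricting the count to list words is what lets a biasing method be judged on the hard words separately from overall recognition quality.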
