A Light-weight contextual spelling correction model for customizing transducer-based speech recognition systems

Xiaoqiang Wang; Yanqing Liu; Sheng Zhao; Jinyu Li

arXiv:2108.07493·cs.CL·August 29, 2021·1 cites

Open Access

TL;DR

This paper presents a lightweight contextual spelling correction model that enhances transducer-based speech recognition by effectively incorporating dynamic context information, achieving significant error reduction and handling out-of-vocabulary terms.

Contribution

The work introduces a novel, efficient spelling correction model with a shared context encoder and filtering algorithm, improving ASR accuracy and out-of-vocabulary handling.

Findings

01. 50% relative word error rate reduction
02. Outperforms contextual LM biasing methods
03. Effective on out-of-vocabulary terms

Abstract

It is challenging to customize a transducer-based automatic speech recognition (ASR) system with context information that is dynamic and unavailable during model training. In this work, we introduce a light-weight contextual spelling correction model to correct context-related recognition errors in transducer-based ASR systems. We incorporate the context information into the spelling correction model with a shared context encoder and use a filtering algorithm to handle large context lists. Experiments show that the model improves on the baseline ASR model with about 50% relative word error rate reduction, and that it significantly outperforms baseline methods such as contextual LM biasing. The model also performs well on out-of-vocabulary terms not seen during training.
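
The headline figure above is a relative word error rate (WER) reduction. As a minimal sketch of how such a figure is usually computed, the Python below scores two hypotheses against a reference with a word-level edit distance and then takes the relative difference. The function names and the example sentences are hypothetical and chosen only for illustration; they are not taken from the paper, and the paper's own evaluation setup may differ.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_reduction(baseline_wer: float, corrected_wer: float) -> float:
    """Fraction of the baseline WER that the correction step removes."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - corrected_wer) / baseline_wer


# Hypothetical utterance and hypotheses, for illustration only.
reference = "call alinta now please"
baseline_hypothesis = "call a linda now please"   # contact name misrecognized
corrected_hypothesis = "call alinta now please"   # after a correction step

baseline_wer = word_error_rate(reference, baseline_hypothesis)
corrected_wer = word_error_rate(reference, corrected_hypothesis)
print(f"baseline WER:  {baseline_wer:.2f}")
print(f"corrected WER: {corrected_wer:.2f}")
print(f"relative reduction: {relative_wer_reduction(baseline_wer, corrected_wer):.0%}")
```

A relative reduction of about 50% means the corrected system makes roughly half as many word errors as the baseline on the same test set, whatever the absolute WER values are.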

Taxonomy

Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Topic Modeling