PERL: Pinyin Enhanced Rephrasing Language Model for Chinese ASR N-best Error Correction
Junhong Liang, Bojun Zhang

TL;DR
This paper introduces PERL, a Pinyin-enhanced rephrasing language model for Chinese ASR N-best error correction. By using phonetic (Pinyin) information, it reduces Character Error Rate (CER) by 29.11% on Aishell-1 and by around 70% on domain-specific datasets.
Contribution
The paper presents a novel Pinyin-enhanced language model specifically designed for Chinese ASR correction, effectively utilizing phonetic features for improved accuracy.
Findings
Achieves 29.11% CER reduction on Aishell-1
Reaches around 70% CER reduction on domain-specific datasets
Demonstrates low latency and effective correction of wrong characters
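The findings above are stated as CER reductions. As a minimal sketch of how such figures are commonly computed, the snippet below implements character-level edit distance, CER, and a relative reduction. It assumes the reported percentages are relative reductions against a baseline CER, which the summary does not state explicitly; the function names and the sample sentences and rates are illustrative, not taken from the paper.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character Error Rate: edit distance divided by reference length."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


def relative_reduction(baseline_cer: float, corrected_cer: float) -> float:
    """Relative CER reduction (assumed to be the metric behind the findings)."""
    return (baseline_cer - corrected_cer) / baseline_cer


# Illustrative values only: one substituted character in a six-character reference.
sample_cer = cer("今天天气很好", "今天天汽很好")
# Hypothetical baseline and corrected rates, not results from the paper.
sample_reduction = relative_reduction(0.10, 0.07)
```

Under this definition, a drop from a hypothetical baseline CER of 10% to 7% would count as a 30% reduction, not a 3-point one.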
Abstract
Existing Chinese ASR correction methods have not effectively utilized Pinyin information, a unique feature of the Chinese language. In this study, we address this gap by proposing a Pinyin Enhanced Rephrasing Language model (PERL) pipeline, designed explicitly for N-best correction scenarios. We conduct experiments on the Aishell-1 dataset and our newly proposed DoAD dataset. The results show that our approach outperforms baseline methods, achieving a 29.11% reduction in Character Error Rate on Aishell-1 and around 70% CER reduction on domain-specific datasets. PERL predicts the correct length of the output, leveraging the Pinyin information, which is embedded with a semantic model to perform phonetically similar corrections. Extensive experiments demonstrate the effectiveness of correcting wrong characters using N-best output and the low latency of…
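To make the notion of phonetically similar errors in N-best output concrete, here is a toy sketch that compares two equal-length hypotheses and reports the positions where the characters differ but share the same toneless Pinyin. This is not the PERL pipeline: the hand-written `TOY_PINYIN` table and the `homophone_positions` helper are hypothetical and exist only for illustration; a real system would need a full character-to-Pinyin lexicon and a learned model.

```python
# Hypothetical, hand-written toneless Pinyin table covering only this example.
TOY_PINYIN = {
    "他": "ta", "她": "ta",
    "在": "zai", "再": "zai",
    "家": "jia", "里": "li",
}


def homophone_positions(hyp_a: str, hyp_b: str, pinyin=TOY_PINYIN) -> list:
    """Return indices where two equal-length hypotheses disagree on the
    character but agree on its Pinyin (a likely homophone confusion)."""
    if len(hyp_a) != len(hyp_b):
        raise ValueError("this toy comparison assumes equal-length hypotheses")
    positions = []
    for i, (a, b) in enumerate(zip(hyp_a, hyp_b)):
        py_a, py_b = pinyin.get(a), pinyin.get(b)
        if a != b and py_a is not None and py_a == py_b:
            positions.append(i)
    return positions


# Two hypothetical N-best entries that sound alike but differ in writing.
conflicts = homophone_positions("他在家里", "她再家里")
```

Positions flagged this way are where a corrector has to choose among same-sounding characters, which is the kind of decision that Pinyin features are meant to inform.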
Taxonomy
Topics: Natural Language Processing Techniques
