PERL: Pinyin Enhanced Rephrasing Language Model for Chinese ASR N-best Error Correction

Junhong Liang; Bojun Zhang

arXiv:2412.03230·cs.CL·September 23, 2025

TL;DR

This paper introduces PERL, a Pinyin-enhanced rephrasing language model for Chinese ASR N-best error correction. By leveraging phonetic information, it reduces Character Error Rate (CER) by 29.11% on Aishell-1 and by around 70% on domain-specific datasets.

Contribution

The paper presents a Pinyin-enhanced rephrasing language model pipeline designed for Chinese ASR N-best correction. It uses Pinyin information to predict the correct output length and to perform phonetically similar corrections.

Findings

01. Achieves a 29.11% CER reduction on Aishell-1

02. Reaches around 70% CER reduction on domain-specific datasets

03. Demonstrates low latency and effective correction of wrong characters

Abstract

Existing Chinese ASR correction methods have not effectively utilized Pinyin information, a unique feature of the Chinese language. In this study, we address this gap by proposing a Pinyin Enhanced Rephrasing Language model (PERL) pipeline, designed explicitly for N-best correction scenarios. We conduct experiments on the Aishell-1 dataset and our newly proposed DoAD dataset. The results show that our approach outperforms baseline methods, achieving a 29.11% reduction in Character Error Rate on Aishell-1 and around 70% CER reduction on domain-specific datasets. PERL predicts the correct length of the output, leveraging the Pinyin information, which is embedded with a semantic model to perform phonetically similar corrections. Extensive experiments demonstrate the effectiveness of correcting wrong characters using N-best output and the low latency of…
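For readers unfamiliar with the metric, the following is a minimal sketch of how Character Error Rate and a relative CER reduction are commonly computed. It is not the authors' evaluation code. The function names, the toy sentence pair, and the example CER values are illustrative assumptions, and the sketch assumes the reported reductions are relative to the uncorrected ASR output, which this page does not state explicitly.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Character-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # cost of deleting the first i reference characters
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character Error Rate: edit distance divided by reference length."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


def relative_cer_reduction(cer_before: float, cer_after: float) -> float:
    """Fraction of the original CER removed by correction (assumed definition)."""
    if cer_before <= 0:
        raise ValueError("cer_before must be positive")
    return (cer_before - cer_after) / cer_before


if __name__ == "__main__":
    # Toy pair (not from the paper): one homophone substitution in six characters.
    reference = "今天天气很好"
    hypothesis = "今天天汽很好"
    print(cer(reference, hypothesis))
    # Hypothetical CER values chosen only to illustrate the arithmetic.
    print(relative_cer_reduction(0.10, 0.07))
```

Under this definition, a system that lowers CER from 10% to 7% achieves a 30% relative reduction, even though the absolute drop is only three points.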

Taxonomy

Topics: Natural Language Processing Techniques