LLM-based Generative Error Correction for Rare Words with Synthetic Data and Phonetic Context

Natsuo Yamashita; Masaaki Yamamoto; Hiroaki Kokubo; Yohei Kawaguchi

arXiv:2505.17410 · cs.SD · May 26, 2025

TL;DR

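This paper introduces an LLM-based generative error correction (GER) method that improves rare word correction in speech recognition. It fine-tunes the GER model on synthetic data containing rare words and supplies phonetic context alongside the ASR N-best hypotheses, which reduces WER and CER on English and Japanese datasets.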

Contribution

The paper combines synthetic data generation for rare words with phonetic cues to improve rare word correction in LLM-based generative error correction (GER) for automatic speech recognition (ASR).

Findings

01 Improves rare word correction accuracy.

02 Reduces WER and CER on both English and Japanese datasets.

03 Mitigates over-correction by integrating phonetic context.
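
For readers unfamiliar with the two metrics named in the findings, the following is a minimal sketch of how word error rate (WER) and character error rate (CER) are commonly computed: edit distance between reference and hypothesis, divided by the reference length. It is not the paper's evaluation code. The function names and the choice to strip whitespace before computing CER are assumptions made for illustration, and the example sentence is invented.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (r != h)  # 0 cost when tokens match
            deletion = prev[j] + 1                 # reference token missing from hypothesis
            insertion = curr[j - 1] + 1            # extra token in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate, with whitespace removed (an assumed convention)."""
    ref_chars = list("".join(reference.split()))
    if not ref_chars:
        raise ValueError("reference must contain at least one character")
    hyp_chars = list("".join(hypothesis.split()))
    return edit_distance(ref_chars, hyp_chars) / len(ref_chars)


# Invented example: a rare name split into common words by the recognizer.
reference = "play songs by yamashita"
hypothesis = "play songs by yama sheet a"
print(f"WER = {wer(reference, hypothesis):.2f}")
print(f"CER = {cer(reference, hypothesis):.2f}")
```

A single misrecognized rare word can cost several word-level edits, as in the invented example, which is one reason rare words weigh heavily on WER.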

Abstract

Generative error correction (GER) with large language models (LLMs) has emerged as an effective post-processing approach to improve automatic speech recognition (ASR) performance. However, it often struggles with rare or domain-specific words due to limited training data. Furthermore, existing LLM-based GER approaches primarily rely on textual information, neglecting phonetic cues, which leads to over-correction. To address these issues, we propose a novel LLM-based GER approach that targets rare words and incorporates phonetic information. First, we generate synthetic data to contain rare words for fine-tuning the GER model. Second, we integrate ASR's N-best hypotheses along with phonetic context to mitigate over-correction. Experimental results show that our method not only improves the correction of rare words but also reduces the WER and CER across both English and Japanese datasets.
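
The second step in the abstract, passing N-best hypotheses together with phonetic context to the LLM, could look roughly like the sketch below. This is an illustration under stated assumptions, not the authors' prompt: the `Hypothesis` class, the `build_ger_prompt` function, the instruction wording, and the phoneme strings are all hypothetical, and the paper's actual input format may differ.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hypothesis:
    """One ASR hypothesis with a phonetic transcription (hypothetical structure)."""
    text: str
    phonemes: str


def build_ger_prompt(hypotheses: List[Hypothesis]) -> str:
    """Format N-best hypotheses and their phonemes into a single correction prompt.

    The instruction text and layout are assumptions for illustration only.
    """
    if not hypotheses:
        raise ValueError("at least one hypothesis is required")
    lines = [
        "Below are the N-best hypotheses from a speech recognizer,",
        "each with its phonetic transcription.",
        "Output the most likely correct transcription.",
        "Keep words that are consistent with the phonemes.",
        "",
    ]
    for rank, hyp in enumerate(hypotheses, start=1):
        lines.append(f"{rank}. text: {hyp.text} | phonemes: {hyp.phonemes}")
    lines.append("")
    lines.append("Corrected transcription:")
    return "\n".join(lines)


# Invented 2-best list; the phoneme strings are illustrative, not real ASR output.
n_best = [
    Hypothesis("play songs by yama sheet a", "P L EY S AO NG Z B AY Y AA M AH SH IY T AH"),
    Hypothesis("play songs by yamashita", "P L EY S AO NG Z B AY Y AA M AH SH IY T AH"),
]
print(build_ger_prompt(n_best))
```

The intent of showing phonemes next to each hypothesis is that the model can see when candidate spellings sound alike, which gives it a reason to leave phonetically consistent words untouched rather than rewrite them.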

Code & Models

Repositories

natsuooo/llm-ger
none · Official

Taxonomy

Methods: Graph Convolutional Network · Gait Emotion Recognition