Can Large Language Models Reliably Correct Errors in Low-Resource ASR? A Contamination-Aware Case Study on West Frisian

Yun Hao; Reihaneh Amooie; Wietse de Vries; Rik van Noord; Martijn Wieling

arXiv:2605.19711·cs.CL·May 20, 2026

TL;DR

This study evaluates the effectiveness of large language models in correcting errors in low-resource Frisian ASR, demonstrating genuine improvements and analyzing correction patterns while controlling for data contamination.

Contribution

The paper provides the first comprehensive analysis of LLM-based generative error correction (GER) in low-resource ASR, including contamination control and detailed error analysis.

Findings

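01 GER improves ASR performance for low-resource Frisian in most settings.

02 The best GPT-5.1 results surpass oracle WERs.

03 Gains on the non-public offline dataset are comparable to those on the public corpus, which indicates that the improvements reflect true correction ability.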

Abstract

Automatic speech recognition (ASR) has improved substantially in recent years, yet performance remains limited for low-resource languages. Large language models (LLMs) have shown promise for improving ASR through generative error correction (GER), but their effectiveness in low-resource settings remains underexplored. In addition, it remains unclear to what extent data contamination influences the reported improvements in LLM-based GER. This study investigates LLM-based GER for low-resource Frisian. In addition to a public corpus, we construct and use a Frisian offline dataset with non-public texts for evaluation to control for potential data contamination. Results show that GER improves ASR performance in most settings, with the best GPT-5.1 results surpassing oracle WERs. Comparable gains on the offline dataset indicate that improvements reflect true correction ability. We further…
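To make the "surpassing oracle WERs" claim concrete, the sketch below computes word error rate (WER) and an oracle WER over an N-best list. It assumes the common GER convention that the oracle WER is the lowest WER among the ASR hypotheses; this excerpt does not state the paper's exact definition. The function names and the English toy sentences are hypothetical illustrations, not data or code from the paper.

```python
def edit_distance(ref_words, hyp_words):
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """WER = word-level edit distance divided by the reference length."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def oracle_wer(reference, nbest):
    """Assumed definition: the best WER achievable by picking one hypothesis."""
    return min(wer(reference, hyp) for hyp in nbest)


# Toy example (invented sentences, not from the paper's datasets).
reference = "the cat sat on the mat"
nbest = [
    "the cat sat on a mat",    # one substitution
    "the cat sat on the hat",  # one substitution
    "a cat sat in the mat",    # two substitutions
]
corrected = "the cat sat on the mat"  # hypothetical LLM-corrected output

oracle = oracle_wer(reference, nbest)
corrected_score = wer(reference, corrected)
print(f"oracle WER: {oracle:.3f}, corrected WER: {corrected_score:.3f}")
```

In this toy case every hypothesis contains an error, yet the errors fall in different positions, so a corrector that combines evidence across hypotheses can score below the oracle. That is one way a GER system could surpass an oracle WER defined by selection alone.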
