BlasBench: An Open Benchmark for Irish Speech Recognition
Jyoutir Raj, John Conway

TL;DR
BlasBench introduces an Irish-aware normalisation and evaluation framework for speech recognition, enabling reliable benchmarking of diverse systems on Irish speech datasets.
Contribution
It provides the first Irish-specific normaliser and reproducible evaluation harness for speech recognition benchmarking.
Findings
All Whisper variants exceed 100% WER through insertion-driven hallucination; inserted words count as errors, so WER is not capped at 100%.
Microsoft Azure achieves 22.2% WER on Common Voice and 57.5% on FLEURS.
The best open model, Omnilingual ASR 7B, achieves 30.65% WER on Common Voice and 39.09% on FLEURS.
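A WER above 100% can look like a mistake, so a small worked example may help. The sketch below is a generic word-level edit-distance WER, not BlasBench's scoring code; the function names and the Irish example sentence are illustrative assumptions.

```python
def word_edit_distance(ref_words, hyp_words):
    """Levenshtein distance over word lists (substitutions, deletions, insertions)."""
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: edit distance divided by the reference length."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return word_edit_distance(ref_words, hyp_words) / len(ref_words)


# Hypothetical hallucination: the model keeps generating after the speech ends.
reference = "dia duit"
hypothesis = "dia duit agus go raibh maith agat"
print(f"{wer(reference, hypothesis):.0%}")
```

Here the two reference words are recognised correctly, but the five extra words each count as an insertion, giving 5 / 2 = 250% WER. The denominator is the reference length only, which is why a system that hallucinates long continuations can score above 100%.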
Abstract
Existing multilingual benchmarks include Irish among dozens of languages but apply no Irish-aware text normalisation, leaving reliable and reproducible ASR comparison impossible. We introduce BlasBench, an open evaluation harness that provides a standalone Irish-aware normaliser preserving fadas, lenition, and eclipsis; a reproducible scoring harness; and per-utterance predictions released for all evaluated runs. We pilot this by benchmarking 12 systems across four architecture families on Common Voice ga-IE and FLEURS ga-IE. All Whisper variants exceed 100% WER through insertion-driven hallucination. Microsoft Azure reaches 22.2% WER on Common Voice and 57.5% on FLEURS; the best open model, Omnilingual ASR 7B, reaches 30.65% and 39.09% respectively. Models fine-tuned on Common Voice degrade 33-43 points moving to FLEURS, while massively multilingual models degrade only 7-10 - a…
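To illustrate why Irish-aware normalisation matters, here is a minimal sketch of a normaliser that keeps fadas and the hyphens and apostrophes used in Irish orthography. It is not the BlasBench normaliser, whose rules are not given in this summary; `normalise_irish`, `ascii_fold`, and every rule in them are assumptions chosen for illustration.

```python
import re
import unicodedata


def normalise_irish(text):
    """Hypothetical Irish-aware normaliser (illustrative, not BlasBench's rules)."""
    # NFC composes a base vowel plus combining acute into one code point,
    # so "a\u0301" and "á" compare equal after normalisation.
    text = unicodedata.normalize("NFC", text)
    text = text.lower()
    # Unify typographic apostrophes and hyphens with their ASCII forms.
    text = text.replace("\u2019", "'")
    text = re.sub(r"[\u2010\u2011]", "-", text)
    # Drop punctuation, but keep word characters (including á é í ó ú),
    # whitespace, hyphens, and apostrophes.
    text = re.sub(r"[^\w\s'-]", " ", text)
    # Remove hyphens and apostrophes that are not inside a word,
    # keeping forms such as "t-úll" and "d'ól" intact.
    text = re.sub(r"(?<!\w)[-']+|[-']+(?!\w)", " ", text)
    return " ".join(text.split())


def ascii_fold(text):
    """Generic diacritic stripping, shown only as a contrast."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(normalise_irish("Tá an t-úll ar an mBord!"))
print(ascii_fold("féar") == "fear")
```

The contrast function shows the failure mode of language-agnostic normalisation: stripping diacritics maps "féar" (grass) onto "fear" (man), so a scorer built on it would count a wrong word as correct. The sketch leaves lenited and eclipsed forms such as "mbord" untouched because it never rewrites letters, only case and punctuation.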
