Towards spoken dialect identification of Irish
Liam Lonergan, Mengjie Qian, Neasa Ní Chiaráin, Christer Gobl, Ailbhe Ní Chasaide

TL;DR
This paper explores automatic spoken dialect identification for Irish using acoustic and text-based models, achieving up to 76% accuracy, with particular success in identifying the Ulster dialect.
Contribution
It introduces a combined acoustic and text-based approach for Irish dialect identification, demonstrating improved accuracy over previous methods.
Findings
The ECAPA-TDNN model pretrained on VoxLingua107 performed best.
Fusion of acoustic and text models increased accuracy to 76%.
Ulster dialect was identified with 94% accuracy, but other dialects remain challenging.
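
This summary does not say how the acoustic and text models were fused. As an illustration only, the sketch below shows one common option, score-level (late) fusion, in which each classifier's logits are converted to posterior probabilities and then combined by a weighted average. The function names, the `weight` parameter, and the example logits are all hypothetical and are not taken from the paper.

```python
import math

# The three dialect classes named in the summary.
DIALECTS = ["Ulster", "Connacht", "Munster"]


def softmax(logits):
    """Convert raw classifier scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(acoustic_probs, text_probs, weight=0.5):
    """Weighted average of two posterior distributions.

    `weight` is the share given to the acoustic model (an assumed,
    tunable hyperparameter; the paper's actual scheme may differ).
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if len(acoustic_probs) != len(text_probs):
        raise ValueError("both models must score the same classes")
    return [weight * a + (1.0 - weight) * t
            for a, t in zip(acoustic_probs, text_probs)]


def predict(acoustic_logits, text_logits, weight=0.5):
    """Return the most probable dialect and the fused distribution."""
    fused = fuse(softmax(acoustic_logits), softmax(text_logits), weight)
    best = max(range(len(fused)), key=fused.__getitem__)
    return DIALECTS[best], fused


if __name__ == "__main__":
    # Made-up logits for a single utterance, for illustration only.
    label, fused = predict([2.0, 0.5, 0.1], [0.3, 0.4, 0.2], weight=0.7)
    print(label, [round(p, 3) for p in fused])
```

In a setup like this, `weight` would normally be tuned on held-out data. Setting it to 1.0 or 0.0 recovers the acoustic-only or text-only prediction, which makes it easy to check whether fusion helps at all.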
Abstract
The Irish language is rich in its diversity of dialects and accents. This compounds the difficulty of creating a speech recognition system for the low-resource language, as such a system must contend with a high degree of variability with limited corpora. A recent study investigating dialect bias in Irish ASR found that balanced training corpora gave rise to unequal dialect performance, with performance for the Ulster dialect being consistently worse than for the Connacht or Munster dialects. Motivated by this, the present experiments investigate spoken dialect identification of Irish, with a view to incorporating such a system into the speech recognition pipeline. Two acoustic classification models are tested, XLS-R and ECAPA-TDNN, in conjunction with a text-based classifier using a pretrained Irish-language BERT model. The ECAPA-TDNN, particularly a model pretrained for language…
Taxonomy
Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Phonetics and Phonology Research
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Weight Decay · Linear Warmup With Linear Decay · Residual Connection · Adam · Dense Connections · Dropout
