Breaking Walls: Pioneering Automatic Speech Recognition for Central Kurdish: End-to-End Transformer Paradigm
Abdulhady Abas Abdullah, Hadi Veisi, Tarik Rashid

TL;DR
This paper develops an end-to-end transformer-based ASR system for Central Kurdish, overcoming resource limitations through corpus collection, transfer learning, language modeling, and external tokenization, achieving competitive accuracy.
Contribution
It introduces a novel approach combining fine-tuning of large pre-trained models, language models, and external tokenization for low-resource Kurdish speech recognition.
Findings
Achieved 10.0% WER on validation set
Outperformed existing Kurdish ASR models
Demonstrated effectiveness of transfer learning and external tokenization
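For readers unfamiliar with the metric behind the 10.0% figure, word error rate (WER) is conventionally the word-level edit distance between the reference transcript and the system output, divided by the number of reference words. The sketch below is a minimal illustration of that standard definition, not the authors' evaluation script; the function name `word_error_rate` and the whitespace tokenization are assumptions made for the example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are separated by whitespace. The paper's own
    evaluation may normalize or tokenize text differently.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# One wrong word in a ten-word reference gives a WER of 10%.
example_wer = word_error_rate(
    "a b c d e f g h i j",
    "a b c d e f g h i x",
)
```

Under this definition, a 10.0% WER means roughly one word-level error (substitution, insertion, or deletion) per ten reference words.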
Abstract
End-to-end transformer-based models epitomize the cutting-edge in Automatic Speech Recognition (ASR) systems. Despite their substantial benefits, these models demand extensive training data to perform optimally, presenting a significant challenge for low-resource languages such as Central Kurdish. Addressing this issue requires innovative methods and techniques. This paper aims to develop an ASR system for Central Kurdish by collecting a robust corpus of speech, using an n-gram language model, and utilizing an external Kurdish tokenizer for refinement and integration techniques to enhance the model's performance. We collect a comprehensive 100-hour speech corpus from diverse sources. Additionally, we apply fine-tuning techniques to our speech corpus on Persian, English, and Arabic pre-trained models, specifically utilizing the xls-r-300m, xls-r-1b, and xls-r-2b Wav2vec 2.0 models.…
Taxonomy
Topics: Linguistics and Cultural Studies
Methods: Sparse Evolutionary Training
