Breaking Walls: Pioneering Automatic Speech Recognition for Central Kurdish: End-to-End Transformer Paradigm

Abdulhady Abas Abdullah; Hadi Veisi; Tarik Rashid

arXiv:2406.02561 · eess.AS · September 10, 2024 · 1 citation

Open Access

TL;DR

This paper develops an end-to-end transformer-based ASR system for Central Kurdish, overcoming resource limitations through corpus collection, transfer learning, language modeling, and external tokenization, achieving competitive accuracy.

Contribution

It introduces a novel approach combining fine-tuning of large pre-trained models, language models, and external tokenization for low-resource Kurdish speech recognition.

Findings

01 Achieved a 10.0% word error rate (WER) on the validation set

02 Outperformed existing Kurdish ASR models

03 Demonstrated the effectiveness of transfer learning and external tokenization
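
For readers unfamiliar with the metric in the first finding, WER is conventionally the number of word-level substitutions, deletions, and insertions needed to turn the system output into the reference transcript, divided by the number of reference words. The sketch below is a minimal illustration of that standard definition, not the authors' evaluation code; the function name `word_error_rate` and the placeholder sentences are assumptions made for the example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative sketch only: tokenization here is plain whitespace
    splitting, which is an assumption and not the paper's tokenizer.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Placeholder English words, not data from the paper:
# one substituted word out of ten reference words.
print(word_error_rate("a b c d e f g h i j", "a b c d e f g h i x"))  # 0.1
```

Under this definition, a 10.0% WER corresponds to roughly one word error per ten reference words. Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.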

Abstract

End-to-end transformer-based models represent the state of the art in Automatic Speech Recognition (ASR) systems. Despite their substantial benefits, these models demand extensive training data to perform optimally, presenting a significant challenge for low-resource languages such as Central Kurdish. Addressing this issue requires innovative methods and techniques. This paper aims to develop an ASR system for Central Kurdish by collecting a robust speech corpus, using an n-gram language model, and utilizing an external Kurdish tokenizer, along with refinement and integration techniques, to enhance the model's performance. We collect a comprehensive 100-hour speech corpus from diverse sources. Additionally, we fine-tune Persian, English, and Arabic pre-trained models on our speech corpus, specifically the xls-r-300m, xls-r-1b, and xls-r-2b wav2vec 2.0 models.…

Taxonomy

Topics: Linguistics and Cultural Studies

Methods: Sparse Evolutionary Training