End-to-End Automatic Speech Recognition model for the Sudanese Dialect

Ayman Mansour; Wafaa F. Mukhtar

arXiv:2212.10826 · cs.CL · December 22, 2022 · 1 citation

TL;DR

This paper explores the development of an end-to-end speech recognition model for the Sudanese dialect, addressing resource scarcity and demonstrating a baseline with a 73.67% label error rate.

Contribution

It introduces a novel dataset for the Sudanese dialect and proposes a CNN-based end-to-end speech recognition model tailored for this underrepresented language variant.

Findings

01

Achieved an average Label Error Rate of 73.67%.

02

Constructed a modest Sudanese dialect dataset.

03

Provided insights into recognition challenges for the dialect.
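
The Label Error Rate reported in the first finding is not defined on this page. A minimal sketch follows, assuming the definition common in the CTC literature: the mean, over utterances, of the edit distance between predicted and reference label sequences divided by the reference length. The function names `edit_distance` and `label_error_rate` are hypothetical, and the paper's own scoring procedure may differ.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two label sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis prefix
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference label
                            curr[j - 1] + 1,      # insert a hypothesis label
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def label_error_rate(references, hypotheses):
    """Mean per-utterance edit distance, normalised by reference length.

    Assumed definition, not taken from the paper. Empty references are skipped.
    """
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    rates = [edit_distance(r, h) / len(r)
             for r, h in zip(references, hypotheses) if len(r) > 0]
    if not rates:
        raise ValueError("no non-empty references to score")
    return sum(rates) / len(rates)
```

Under this definition, a rate of 73.67% would mean that, on average, about three edits are needed for every four reference labels to turn the model output into the reference transcription.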

Abstract

Designing a natural voice interface relies mostly on speech recognition for interaction between humans and their modern digital equipment. In addition, speech recognition narrows the gap between monolingual individuals, helping them communicate. However, the field lacks wide support for several universal languages and their dialects, even though most daily conversations are carried out in them. This paper examines the viability of designing an Automatic Speech Recognition model for the Sudanese dialect, one of the dialects of Arabic, whose complexity is a product of historical and social conditions unique to its speakers. These conditions are reflected in both the form and content of the dialect, so this paper gives an overview of the Sudanese dialect and of the collection of representative resources and the pre-processing performed to construct a modest…

Taxonomy

Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques

Methods: Convolution