
TL;DR
This paper develops a multi-dialect Arabic speech recognition system using deep neural networks, creating a large normalized corpus and achieving state-of-the-art accuracy with a convolutional-recurrent architecture.
Contribution
It introduces a large multi-dialect Arabic speech corpus and a novel deep learning framework that outperforms previous systems in accuracy.
Findings
Achieved a 14% error rate in speech recognition.
Developed a normalized, multi-dialect Arabic speech corpus.
Implemented a convolutional-recurrent neural network architecture.
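This summary does not say whether the 14% figure is a word error rate or a character error rate. As a rough illustration only, both are conventionally computed as the edit distance between reference and hypothesis divided by the reference length. The function names below are hypothetical and are not taken from the paper.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitute, insert, delete cost 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            curr.append(min(
                prev[j] + 1,             # deletion of a reference token
                curr[j - 1] + 1,         # insertion of a hypothesis token
                prev[j - 1] + (r != h),  # substitution, or match at no cost
            ))
        prev = curr
    return prev[-1]


def error_rate(ref_tokens, hyp_tokens):
    """Edit distance normalized by the reference length."""
    if not ref_tokens:
        raise ValueError("reference must not be empty")
    return edit_distance(ref_tokens, hyp_tokens) / len(ref_tokens)


def word_error_rate(ref, hyp):
    # Tokens are whitespace-separated words.
    return error_rate(ref.split(), hyp.split())


def char_error_rate(ref, hyp):
    # Tokens are individual characters, including spaces.
    return error_rate(list(ref), list(hyp))
```

Under this definition, a 14% rate would mean roughly 14 edits per 100 reference tokens, whichever token unit the paper uses.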
Abstract
This paper presents the design and development of multi-dialect automatic speech recognition for Arabic. Deep neural networks are becoming an effective tool for solving sequential data problems, particularly when the system is trained end-to-end. Arabic speech recognition is a complex task because of the existence of multiple dialects, the lack of large corpora, and missing vocalization. Thus, the first contribution of this work is the development of a large multi-dialectal corpus with fully or at least partially vocalized transcriptions. Additionally, the open-source corpus has been gathered from multiple sources, which introduce non-standard Arabic characters into the transcriptions; these are normalized by defining a common character set. The second contribution is the development of a framework to train an acoustic model achieving state-of-the-art performance. The network…
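The abstract's normalization step, mapping non-standard characters onto a common character set, might look something like the sketch below. The mapping table is an assumption made for illustration; the paper's actual character set is not given in this summary. Diacritics are left untouched here on the assumption that vocalization should be preserved.

```python
import unicodedata

# Hypothetical variant-to-canonical mapping; NOT the paper's character set.
VARIANT_MAP = {
    "\u0622": "\u0627",  # alef with madda above -> bare alef
    "\u0623": "\u0627",  # alef with hamza above -> bare alef
    "\u0625": "\u0627",  # alef with hamza below -> bare alef
    "\u0649": "\u064A",  # alef maksura -> yeh
    "\u06CC": "\u064A",  # Farsi yeh -> Arabic yeh
    "\u06A9": "\u0643",  # keheh -> Arabic kaf
    "\u0640": "",        # tatweel (elongation mark) is dropped
}


def normalize_transcript(text):
    """Map variant characters to one canonical form, keeping diacritics."""
    # Compose combining sequences first so each variant is a single code point.
    text = unicodedata.normalize("NFC", text)
    return "".join(VARIANT_MAP.get(ch, ch) for ch in text)
```

A table-driven mapping keeps the character set explicit and easy to audit when transcripts come from several sources with different conventions.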
