Arabic Speech Recognition by End-to-End, Modular Systems and Human

Amir Hussein; Shinji Watanabe; Ahmed Ali

arXiv:2101.08454 · eess.AS · June 30, 2021

TL;DR

This study benchmarks Arabic speech recognition, comparing end-to-end transformer models, modular HMM-DNN systems, and human transcribers. It finds that humans still outperform machines, with an average word error rate (WER) gap of 3.5%.

Contribution

The paper provides the first comprehensive benchmark of Arabic automatic speech recognition (ASR) systems against human performance across dialects, highlighting the current limitations of machine models.

Findings

01

End-to-end transformer ASR achieved WERs of 12.5%, 27.5%, and 33.8% on the MGB2, MGB3, and MGB5 challenges, respectively.

02

Humans outperform machines by an average WER gap of 3.5%.

03

The study collects a new dataset for evaluating human speech recognition (linguists and lay native speakers) on Arabic and its dialects.

Abstract

Recent advances in automatic speech recognition (ASR) have achieved accuracy levels comparable to human transcribers, which has led researchers to debate whether the machine has reached human performance. Previous work focused on the English language and modular hidden Markov model-deep neural network (HMM-DNN) systems. In this paper, we perform a comprehensive benchmarking of end-to-end transformer ASR, modular HMM-DNN ASR, and human speech recognition (HSR) on the Arabic language and its dialects. For the HSR, we evaluate linguist performance and lay-native speaker performance on a new dataset collected as a part of this study. For ASR, the end-to-end work led to 12.5%, 27.5%, and 33.8% WER, a new performance milestone for the MGB2, MGB3, and MGB5 challenges respectively. Our results suggest that human performance in the Arabic language is still considerably better than the machine with an…
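
The WER figures quoted above follow the conventional definition: the word-level edit distance between a reference transcript and a hypothesis (substitutions, deletions, and insertions), divided by the number of reference words. The sketch below is a minimal illustration of that definition and of an absolute WER gap between two transcribers. It is not the scoring tool used in the paper; the function names `word_error_rate` and `absolute_wer_gap` and the toy strings are assumptions made for illustration, and no text normalization is applied.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative helper (hypothetical name), using whitespace tokenization only.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def absolute_wer_gap(machine_wer: float, human_wer: float) -> float:
    """Absolute (not relative) difference between machine and human WER."""
    return machine_wer - human_wer


if __name__ == "__main__":
    # Toy placeholder transcripts, not data from the paper.
    reference = "a b c d"
    machine = "a x c"    # one substitution and one deletion
    human = "a b c e"    # one substitution
    m = word_error_rate(reference, machine)
    h = word_error_rate(reference, human)
    print(f"machine WER={m:.2f} human WER={h:.2f} gap={absolute_wer_gap(m, h):.2f}")
```

Because the denominator is the reference length, WER can exceed 100% when the hypothesis contains many insertions. Real evaluations on Arabic would typically also normalize the text before scoring, which this sketch leaves out.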

Code & Models

Repositories

espnet/espnet/tree/master/egs/mgb2/asr1 (PyTorch, official)