Towards One Model to Rule All: Multilingual Strategy for Dialectal Code-Switching Arabic ASR
Shammur Absar Chowdhury, Amir Hussein, Ahmed Abdelali, Ahmed Ali

TL;DR
This paper presents a large multilingual end-to-end speech recognition model that handles multiple languages, dialects, and code-switching scenarios, outperforming existing monolingual systems in Arabic dialectal and code-switching ASR tasks.
Contribution
The study introduces a unified self-attention-based Conformer model, trained on Arabic, English, and French, that handles dialectal variation and code-switching within a single speech recognition system.
Findings
Outperforms state-of-the-art monolingual dialectal Arabic ASR
Effective in cross-lingual and dialectal code-switching scenarios
Demonstrates robustness across multiple languages and dialects
Abstract
With the advent of globalization, there is an increasing demand for multilingual automatic speech recognition (ASR) that can handle language and dialectal variation in spoken content. Recent studies show the efficacy of multilingual ASR over monolingual systems. In this study, we design a large multilingual end-to-end ASR using a self-attention-based conformer architecture. We trained the system on Arabic (Ar), English (En) and French (Fr). We evaluate the system's performance in handling: (i) monolingual (Ar, En and Fr); (ii) multi-dialectal (Modern Standard Arabic, along with dialectal variation such as Egyptian and Moroccan); (iii) code-switching -- cross-lingual (Ar-En/Fr) and dialectal (MSA-Egyptian dialect) test cases, and compare with current state-of-the-art systems. Furthermore, we investigate the influence of different embedding/character representations including character vs word-piece; shared…
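The character versus word-piece comparison mentioned in the abstract can be illustrated with a toy sketch. This is not the paper's tokenizer: the vocabulary `VOCAB`, the example utterance, and the greedy longest-match rule are all assumptions chosen only to show how the two output representations differ on a code-switched Ar-En string.

```python
def char_tokens(text):
    """Character-level units; spaces become an explicit boundary symbol."""
    return list(text.replace(" ", "▁"))


def wordpiece_tokens(text, vocab):
    """Greedy longest-match segmentation of each word against `vocab`.

    Falls back to a single character when no longer piece matches.
    """
    tokens = []
    for word in text.split():
        start = 0
        while start < len(word):
            end = len(word)
            # Shrink the candidate piece until it is in the vocabulary
            # or only one character is left.
            while end > start + 1 and word[start:end] not in vocab:
                end -= 1
            tokens.append(word[start:end])
            start = end
    return tokens


# Hypothetical shared vocabulary covering Arabic and English pieces.
VOCAB = {"ال", "يوم", "meet", "ing"}

# Hypothetical code-switched utterance (Arabic word followed by an English word).
utterance = "اليوم meeting"

chars = char_tokens(utterance)
pieces = wordpiece_tokens(utterance, VOCAB)

print(len(chars), chars)
print(len(pieces), pieces)
```

The character sequence is longer but needs only a small symbol inventory per script, while the word-piece sequence is shorter at the cost of a larger vocabulary that must cover every language in the mix.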
