Towards One Model to Rule All: Multilingual Strategy for Dialectal Code-Switching Arabic ASR

Shammur Absar Chowdhury; Amir Hussein; Ahmed Abdelali; Ahmed Ali

arXiv:2105.14779·cs.CL·July 6, 2021

TL;DR

This paper presents a large multilingual end-to-end speech recognition model that handles multiple languages, dialects, and code-switching scenarios, outperforming existing monolingual systems in Arabic dialectal and code-switching ASR tasks.

Contribution

The study introduces a unified self-attention based conformer model trained on Arabic, English, and French, effectively managing dialectal variation and code-switching in speech recognition.

Findings

01 Outperforms state-of-the-art monolingual dialectal Arabic ASR

02 Effective in cross-lingual and dialectal code-switching scenarios

03 Demonstrates robustness across multiple languages and dialects

Abstract

With the advent of globalization, there is an increasing demand for multilingual automatic speech recognition (ASR) that handles language and dialectal variation in spoken content. Recent studies show its efficacy over monolingual systems. In this study, we design a large multilingual end-to-end ASR system using a self-attention-based conformer architecture. We trained the system on Arabic (Ar), English (En) and French (Fr). We evaluate the system's performance on: (i) monolingual (Ar, En and Fr); (ii) multi-dialectal (Modern Standard Arabic, along with dialectal variation such as Egyptian and Moroccan); (iii) code-switching -- cross-lingual (Ar-En/Fr) and dialectal (MSA-Egyptian dialect) test cases, and compare with current state-of-the-art systems. Furthermore, we investigate the influence of different embedding/character representations including character vs word-piece; shared…
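
To make the character vs word-piece comparison mentioned in the abstract concrete, here is a minimal sketch of the two output representations for one utterance. The abstract does not say which word-piece algorithm or vocabulary the authors used, so everything here is an assumption for illustration: the greedy longest-match segmentation, the `##` continuation prefix, the toy vocabulary `TOY_VOCAB`, and the helper names `char_tokenize` and `wordpiece_tokenize` are all hypothetical.

```python
def char_tokenize(text):
    """Character-level units; spaces become an explicit boundary token."""
    return ["<space>" if ch == " " else ch for ch in text]


def wordpiece_tokenize(text, vocab, unk="<unk>"):
    """Greedy longest-match-first segmentation of each whitespace-delimited word.

    Pieces that continue a word carry a "##" prefix (an assumed convention,
    not one taken from the paper). A word with no valid segmentation is
    replaced by a single unknown token.
    """
    pieces = []
    for word in text.split():
        start = 0
        word_pieces = []
        while start < len(word):
            end = len(word)
            match = None
            # Shrink the candidate from the right until it is in the vocabulary.
            while end > start:
                candidate = word[start:end]
                if start > 0:
                    candidate = "##" + candidate
                if candidate in vocab:
                    match = candidate
                    break
                end -= 1
            if match is None:
                word_pieces = [unk]
                break
            word_pieces.append(match)
            start = end
        pieces.extend(word_pieces)
    return pieces


# Hypothetical toy vocabulary; a real system would learn this from transcripts.
TOY_VOCAB = {"code", "##-", "##switch", "##ing", "speech"}

utterance = "code-switching speech"
char_units = char_tokenize(utterance)
piece_units = wordpiece_tokenize(utterance, TOY_VOCAB)

print(len(char_units), char_units)
print(len(piece_units), piece_units)
```

The character sequence is longer but needs only a small, closed symbol set, whereas the word-piece sequence is shorter but depends on a learned vocabulary, which in a multilingual setting may be shared across languages or kept separate per language.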
