Munsit at NADI 2025 Shared Task 2: Pushing the Boundaries of Multidialectal Arabic ASR with Weakly Supervised Pretraining and Continual Supervised Fine-tuning

Mahmoud Salhab; Shameed Sait; Mohammad Abusheikh; Hasan Abusheikh

arXiv:2508.08912·cs.CL·August 13, 2025

TL;DR

This paper introduces a scalable training pipeline that combines weakly supervised pretraining and supervised fine-tuning to develop a high-performing, multi-dialectal Arabic speech recognition system, addressing low-resource challenges.

Contribution

The paper leverages large-scale weakly labeled data and continual supervised fine-tuning to improve Arabic ASR across multiple dialects, achieving state-of-the-art results.

Findings

01. Achieved first place in the NADI 2025 Shared Task 2 for multi-dialectal Arabic ASR.

02. Demonstrated effectiveness of weak supervision combined with fine-tuning for low-resource languages.

03. Produced a robust Arabic ASR model capable of handling diverse dialects.

Abstract

Automatic speech recognition (ASR) plays a vital role in enabling natural human-machine interaction across applications such as virtual assistants, industrial automation, customer support, and real-time transcription. However, developing accurate ASR systems for low-resource languages like Arabic remains a significant challenge due to limited labeled data and the linguistic complexity introduced by diverse dialects. In this work, we present a scalable training pipeline that combines weakly supervised learning with supervised fine-tuning to develop a robust Arabic ASR model. In the first stage, we pretrain the model on 15,000 hours of weakly labeled speech covering both Modern Standard Arabic (MSA) and various Dialectal Arabic (DA) variants. In the subsequent stage, we perform continual supervised fine-tuning using a mixture of filtered weakly labeled data and a small, high-quality…
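The second stage described in the abstract (fine-tuning on a mixture of filtered weakly labeled data and a small high-quality set) might be sketched roughly as follows. This is an illustrative sketch only, not the authors' pipeline: the abstract does not state how the weak data is filtered or mixed, so the agreement-based filter (word error rate between the weak label and a model hypothesis), the `max_wer` threshold, the `clean_fraction` ratio, and all names such as `Utterance`, `filter_weak`, and `build_mixture` are assumptions introduced here.

```python
import random
from dataclasses import dataclass


@dataclass
class Utterance:
    audio_id: str
    transcript: str
    source: str  # "weak" or "clean" (hypothetical labels)


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        return 0.0 if not hyp else 1.0
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


def filter_weak(weak, hypotheses, max_wer=0.2):
    """Assumed filter: keep weak labels that roughly agree with the
    hypothesis produced by the stage-one model for the same audio."""
    return [
        u for u in weak
        if word_error_rate(u.transcript, hypotheses[u.audio_id]) <= max_wer
    ]


def build_mixture(filtered_weak, clean, clean_fraction=0.5, size=1000, seed=0):
    """Assumed mixing scheme: sample with replacement so the small clean
    set fills a fixed share of the fine-tuning mixture."""
    rng = random.Random(seed)
    n_clean = round(size * clean_fraction)
    mixture = rng.choices(clean, k=n_clean)
    mixture += rng.choices(filtered_weak, k=size - n_clean)
    rng.shuffle(mixture)
    return mixture
```

In this sketch, `hypotheses` would be a dictionary mapping each `audio_id` to the transcript decoded by the pretrained model. Sampling with replacement is one simple way to keep a small clean set from being swamped by the much larger weak set. The paper itself may use a different criterion and ratio.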

Taxonomy

Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · ICT in Developing Communities