ELYADATA & LIA at NADI 2025: ASR and ADI Subtasks

Haroun Elleuch; Youssef Saidi; Salima Mdhaffar; Yannick Estève; Fethi Bougares

arXiv:2511.10090·cs.CL·November 14, 2025

TL;DR

This paper presents Elyadata and LIA's systems for NADI 2025, which ranked first in spoken Arabic dialect identification (79.83% accuracy with a fine-tuned Whisper-large-v3 encoder) and second in multi-dialectal Arabic speech recognition (38.54% average WER with per-dialect fine-tuning of SeamlessM4T-v2 Large).

Contribution

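The paper describes two fine-tuning recipes for large pre-trained speech models: a Whisper-large-v3 encoder fine-tuned with data augmentation for Arabic dialect identification, and SeamlessM4T-v2 Large (Egyptian variant) fine-tuned separately for each of eight dialects for speech recognition.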

Findings

01

ADI system achieved 79.83% accuracy on the official test set, the highest score among participants

02

ASR system obtained an average WER of 38.54% and CER of 14.53% on the test set, ranking second

03

Large pre-trained speech models with targeted fine-tuning are effective for Arabic speech tasks

Abstract

This paper describes Elyadata & LIA's joint submission to the NADI multi-dialectal Arabic Speech Processing 2025. We participated in the Spoken Arabic Dialect Identification (ADI) and multi-dialectal Arabic ASR subtasks. Our submission ranked first for the ADI subtask and second for the multi-dialectal Arabic ASR subtask among all participants. Our ADI system is a fine-tuned Whisper-large-v3 encoder with data augmentation. This system obtained the highest ADI accuracy score of 79.83% on the official test set. For multi-dialectal Arabic ASR, we fine-tuned SeamlessM4T-v2 Large (Egyptian variant) separately for each of the eight considered dialects. Overall, we obtained an average WER and CER of 38.54% and 14.53%, respectively, on the test set. Our results demonstrate the effectiveness of large pre-trained speech models with targeted fine-tuning for Arabic…
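
For readers unfamiliar with the two ASR metrics quoted in the abstract, the following is a minimal sketch of how WER and CER are conventionally computed from reference and hypothesis transcripts. It is not the shared task's scoring script. The function names are hypothetical, no text normalization is applied, spaces are ignored when counting characters for CER, and the unweighted mean over dialects in `macro_average` is an assumption about how the "average" is formed, since the abstract does not say.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences
    (minimum number of substitutions, insertions, and deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(pairs, tokenize):
    """Corpus-level error rate in percent: total edits / total reference tokens."""
    errors = 0
    total = 0
    for ref, hyp in pairs:
        ref_tokens = tokenize(ref)
        errors += edit_distance(ref_tokens, tokenize(hyp))
        total += len(ref_tokens)
    if total == 0:
        raise ValueError("reference transcripts contain no tokens")
    return 100.0 * errors / total


def wer(pairs):
    """Word error rate: tokens are whitespace-separated words."""
    return error_rate(pairs, str.split)


def cer(pairs):
    """Character error rate. Assumption: spaces are not counted as characters."""
    return error_rate(pairs, lambda s: list(s.replace(" ", "")))


def macro_average(per_dialect_scores):
    """Unweighted mean over dialects (an assumed averaging scheme)."""
    return sum(per_dialect_scores.values()) / len(per_dialect_scores)
```

Each element of `pairs` is a `(reference, hypothesis)` tuple of strings. In practice, Arabic transcripts are usually normalized (for example, diacritics and punctuation removed) before scoring, and the choice of normalization can shift both metrics noticeably.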

Taxonomy

Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Authorship Attribution and Profiling