Arab Voices: Mapping Standard and Dialectal Arabic Speech Technology

Peter Sullivan; AbdelRahim Elmadany; Alcides Alcoba Inciarte; Muhammad Abdul-Mageed

arXiv:2601.13319 · cs.CL · January 30, 2026



Open Access

TL;DR

This paper analyzes the variability in dialectal Arabic speech datasets and introduces Arab Voices, a standardized framework with unified data access and evaluation tools to improve reproducibility and benchmarking in dialectal Arabic speech recognition.

Contribution

The paper provides a comprehensive analysis of dialectal Arabic speech datasets and introduces Arab Voices, a standardized framework offering unified data access, harmonized metadata, and evaluation utilities for benchmarking speech recognition systems.

Findings

01

Widely used dialectal Arabic corpora differ substantially in acoustic conditions and in the strength and consistency of their dialectal signals.

02

Arab Voices offers standardized access to 31 datasets across 14 dialects.

03

Benchmark results establish strong baselines for future research.

Abstract

Dialectal Arabic (DA) speech data vary widely in domain coverage, dialect labeling practices, and recording conditions, complicating cross-dataset comparison and model evaluation. To characterize this landscape, we conduct a computational analysis of linguistic "dialectness" alongside objective proxies of audio quality on the training splits of widely used DA corpora. We find substantial heterogeneity both in acoustic conditions and in the strength and consistency of dialectal signals across datasets, underscoring the need for standardized characterization beyond coarse labels. To reduce fragmentation and support reproducible evaluation, we introduce Arab Voices, a standardized framework for DA ASR. Arab Voices provides unified access to 31 datasets spanning 14 dialects, with harmonized metadata and evaluation utilities. We further benchmark a range of recent ASR systems, establishing…
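The abstract mentions evaluation utilities and per-dialect benchmarking but does not say which metric or API Arab Voices uses. As an illustration only, the sketch below assumes word error rate (WER), the usual ASR metric, and aggregates it per dialect label. The function names, the `(dialect, reference, hypothesis)` tuple layout, and the labels `egy` and `mor` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    # prev[j] holds the distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr[j] = min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            )
        prev = curr
    return prev[-1]


def wer_by_dialect(samples):
    """Corpus-level WER per dialect: total word errors / total reference words.

    `samples` is an iterable of (dialect, reference, hypothesis) tuples.
    This layout is an assumption made for the sketch, not the paper's format.
    """
    errors = defaultdict(int)
    words = defaultdict(int)
    for dialect, reference, hypothesis in samples:
        ref, hyp = reference.split(), hypothesis.split()
        errors[dialect] += edit_distance(ref, hyp)
        words[dialect] += len(ref)
    # Skip dialects whose references contain no words to avoid dividing by zero.
    return {d: errors[d] / words[d] for d in words if words[d] > 0}


# Placeholder tokens stand in for real transcripts.
samples = [
    ("egy", "a b c d", "a x c d"),  # one substitution
    ("egy", "e f", "e f"),          # exact match
    ("mor", "a b", "a"),            # one deletion
]
result = wer_by_dialect(samples)
```

Pooling errors and reference words per dialect before dividing keeps short utterances from dominating the score. Real Arabic evaluation would normally also apply text normalization (for example, handling diacritics) before scoring, which this sketch omits.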


Taxonomy

Topics: Linguistic Variation and Morphology · Phonetics and Phonology Research · Speech Recognition and Synthesis