Arabic Little STT: Arabic Children Speech Recognition Dataset
Mouhand Alkadri, Dania Desouki, Khloud Al Jallad

TL;DR
This paper introduces Arabic Little STT, a new dataset of Levantine Arabic children's speech, and evaluates the performance of state-of-the-art ASR models, revealing significant challenges and the need for dedicated child speech data.
Contribution
The creation of the first Levantine Arabic child speech dataset and systematic assessment of Whisper ASR models on this data.
Findings
Whisper models perform poorly on child speech: even the best-performing variant (Large_v3) reaches a word error rate (WER) of 0.66.
There is a significant performance gap between child and adult speech recognition.
The results highlight the need for child-specific speech datasets in Arabic.
Abstract
The performance of Artificial Intelligence (AI) systems fundamentally depends on high-quality training data. However, low-resource languages like Arabic suffer from severe data scarcity. Moreover, the absence of child-specific speech corpora is an essential gap that poses significant challenges. To address this gap, we present Arabic Little STT, a dataset of Levantine Arabic child speech recorded in classrooms, containing 355 utterances from 288 children (ages 6-13). We further conduct a systematic assessment of Whisper, a state-of-the-art automatic speech recognition (ASR) model, on this dataset and compare its performance with adult Arabic benchmarks. Our evaluation across eight Whisper variants reveals that even the best-performing model (Large_v3) struggles significantly, achieving a 0.66 word error rate (WER) on child speech, starkly contrasting with its sub…
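For readers unfamiliar with the metric, WER is conventionally the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the model output, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the authors' evaluation code. The function name `word_error_rate` is hypothetical, and the sketch assumes whitespace tokenization with no text normalization (for example, no removal of Arabic diacritics), which may differ from the paper's setup.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are separated by whitespace and compared exactly;
    no normalization is applied to either transcript.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against a four-word reference.
    print(word_error_rate("a b c d", "a x c"))
```

Under this definition, a WER of 0.66 corresponds to roughly two errors for every three reference words. Because insertions count as errors, WER can exceed 1.0.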
