Abjad-Kids: An Arabic Speech Classification Dataset for Primary Education
Abdul Aziz Snoubara, Baraa Al_Maradni, Haya Al_Naal, Malek Al_Madrmani, Roaa Jdini, Seedra Zarzour, Khloud Al Jallad

TL;DR
This paper introduces Abjad-Kids, a comprehensive Arabic speech dataset for primary education, and proposes a hierarchical CNN-LSTM classification approach to recognize children's speech for educational applications.
Contribution
The creation of the Abjad-Kids dataset and the development of a hierarchical CNN-LSTM classification method tailored for Arabic children's speech recognition.
Findings
Static linguistic grouping outperforms dynamic clustering.
CNN-LSTM models with data augmentation improve classification accuracy.
Overfitting remains a challenge due to limited data.
Abstract
Speech-based AI educational applications have gained significant interest in recent years, particularly for children. However, children's speech research remains limited due to the lack of publicly available datasets, especially for low-resource languages such as Arabic. This paper presents Abjad-Kids, an Arabic speech dataset designed for kindergarten and primary education, focusing on fundamental learning of alphabets, numbers, and colors. The dataset consists of 46,397 audio samples collected from children aged 3 to 12 years, covering 141 classes. All samples were recorded under controlled specifications to ensure consistency in duration, sampling rate, and format. To address high intra-class similarity among Arabic phonemes and the limited samples per class, we propose a hierarchical audio classification approach based on CNN-LSTM architectures. Our proposed methodology decomposes alphabet…
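The hierarchical idea described in the abstract can be illustrated with a minimal routing sketch: a first-stage model predicts a coarse group, and a group-specific second-stage model predicts the final class. Everything below is an assumption made for illustration. The group names, class names, threshold rules, and the helper `make_hierarchical_classifier` are hypothetical, and the toy callables stand in for what would be trained CNN-LSTM models in the paper's setting.

```python
from typing import Callable, Dict, Sequence

# Hypothetical type: a classifier maps a feature vector to a label string.
Classifier = Callable[[Sequence[float]], str]


def make_hierarchical_classifier(
    group_model: Classifier,
    class_models: Dict[str, Classifier],
) -> Classifier:
    """Build a two-stage predictor: coarse group first, then class within it."""

    def predict(features: Sequence[float]) -> str:
        group = group_model(features)  # stage 1: pick the coarse group
        if group not in class_models:
            raise KeyError(f"no second-stage model for group {group!r}")
        return class_models[group](features)  # stage 2: class inside that group

    return predict


# Toy stand-ins (assumptions): simple threshold rules instead of trained models.
def toy_group_model(features: Sequence[float]) -> str:
    return "group_a" if features[0] < 0.5 else "group_b"


toy_class_models: Dict[str, Classifier] = {
    "group_a": lambda f: "class_a1" if f[1] < 0.5 else "class_a2",
    "group_b": lambda f: "class_b1" if f[1] < 0.5 else "class_b2",
}

predict = make_hierarchical_classifier(toy_group_model, toy_class_models)
```

With a static grouping, the dictionary keys are fixed in advance (for example, by linguistic criteria); a dynamic variant would instead derive the groups from clustering before training the second-stage models.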
Taxonomy
Topics: Speech Recognition and Synthesis · Machine Learning and Data Classification · Music and Audio Processing
