Abjad-Kids: An Arabic Speech Classification Dataset for Primary Education

Abdul Aziz Snoubara; Baraa Al_Maradni; Haya Al_Naal; Malek Al_Madrmani; Roaa Jdini; Seedra Zarzour; Khloud Al Jallad

arXiv:2603.20255·cs.CL·March 24, 2026

TL;DR

This paper introduces Abjad-Kids, a comprehensive Arabic speech dataset for primary education, and proposes a hierarchical CNN-LSTM classification approach to recognize children's speech for educational applications.

Contribution

The creation of the Abjad-Kids dataset and the development of a hierarchical CNN-LSTM classification method tailored for Arabic children's speech recognition.

Findings

01. Static linguistic grouping outperforms dynamic clustering.

02. CNN-LSTM models with data augmentation improve classification accuracy.

03. Overfitting remains a challenge due to limited data.

Abstract

Speech-based AI educational applications have gained significant interest in recent years, particularly for children. However, research on children's speech remains limited due to the lack of publicly available datasets, especially for low-resource languages such as Arabic. This paper presents Abjad-Kids, an Arabic speech dataset designed for kindergarten and primary education, focusing on fundamental learning of alphabets, numbers, and colors. The dataset consists of 46,397 audio samples collected from children aged 3 to 12 years, covering 141 classes. All samples were recorded under controlled specifications to ensure consistency in duration, sampling rate, and format. To address high intra-class similarity among Arabic phonemes and the limited samples per class, we propose a hierarchical audio classification based on CNN-LSTM architectures. Our proposed methodology decomposes alphabet…
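The hierarchical idea described in the abstract can be illustrated with a minimal two-stage routing sketch: a coarse model first picks a group, then a group-specific model picks the final class. This is only an illustration under stated assumptions, not the authors' implementation. The class `HierarchicalClassifier`, the group names (`group_a`, `group_b`), the leaf labels, and the toy threshold models are all hypothetical stand-ins; in the paper each stage would be a trained CNN-LSTM, and the actual grouping of the 141 classes is not given in the text above.

```python
from typing import Callable, Dict, Sequence, Tuple

# A "classifier" here is any callable mapping features to label -> score.
# In practice each would wrap a trained model; these are stand-ins.
Scores = Dict[str, float]
Classifier = Callable[[Sequence[float]], Scores]


def argmax_label(scores: Scores) -> Tuple[str, float]:
    """Return the highest-scoring label and its score."""
    label = max(scores, key=scores.get)
    return label, scores[label]


class HierarchicalClassifier:
    """Two-stage classifier: coarse group first, then a per-group model."""

    def __init__(self, group_model: Classifier,
                 leaf_models: Dict[str, Classifier]) -> None:
        self.group_model = group_model
        self.leaf_models = leaf_models

    def predict(self, features: Sequence[float]) -> Tuple[str, str, float]:
        # Stage 1: choose the group of acoustically similar classes.
        group, p_group = argmax_label(self.group_model(features))
        # Stage 2: only the chosen group's model decides the final class.
        leaf, p_leaf = argmax_label(self.leaf_models[group](features))
        # Joint confidence is the product of the two stage scores.
        return group, leaf, p_group * p_leaf


# Hypothetical toy models keyed on simple thresholds (assumption, for demo only).
def toy_group_model(x: Sequence[float]) -> Scores:
    p = 0.9 if x[0] > 0 else 0.2
    return {"group_a": p, "group_b": 1 - p}


def toy_leaf_a(x: Sequence[float]) -> Scores:
    p = 0.8 if x[1] > 0 else 0.3
    return {"a1": p, "a2": 1 - p}


def toy_leaf_b(x: Sequence[float]) -> Scores:
    return {"b1": 0.6, "b2": 0.4}


clf = HierarchicalClassifier(
    toy_group_model, {"group_a": toy_leaf_a, "group_b": toy_leaf_b}
)
result = clf.predict([1.0, -1.0])
print(result)
```

One consequence of this design is that a stage-one mistake cannot be corrected at stage two, so the quality of the grouping matters as much as the per-group models.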

Taxonomy

Topics: Speech Recognition and Synthesis · Machine Learning and Data Classification · Music and Audio Processing