Quran-MD: A Fine-Grained Multilingual Multimodal Dataset of the Quran

Muhammad Umar Salman; Mohammad Areeb Qazi; Mohammed Talha Alam

arXiv:2601.17880·cs.CV·January 27, 2026

TL;DR

Quran-MD is a detailed multilingual multimodal dataset combining text, audio, and linguistic data of the Quran at verse and word levels, supporting advanced NLP and speech applications.

Contribution

The paper introduces a comprehensive, fine-grained dataset with diverse recitation styles, enabling new research in Quranic recitation, linguistic analysis, and multimodal AI applications.

Findings

01. Supports tasks like ASR, TTS, and tajweed detection

02. Provides diverse recitation audio from 32 reciters

03. Facilitates multimodal embeddings and semantic retrieval

Abstract

We present Quran-MD, a comprehensive multimodal dataset of the Quran that integrates textual, linguistic, and audio dimensions at the verse and word levels. For each verse (ayah), the dataset provides its original Arabic text, English translation, and phonetic transliteration. To capture the rich oral tradition of Quranic recitation, we include verse-level audio from 32 distinct reciters, reflecting diverse recitation styles and dialectal nuances. At the word level, each token is paired with its corresponding Arabic script, English translation, transliteration, and an aligned audio recording, allowing fine-grained analysis of pronunciation, phonology, and semantic context. This dataset supports various applications, including natural language processing, speech recognition, text-to-speech synthesis, linguistic analysis, and digital Islamic studies. Bridging text and audio modalities…
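The verse-level and word-level structure described in the abstract can be pictured as a nested record. The following is a minimal Python sketch, not the dataset's actual schema: every field name (`arabic`, `translation`, `transliteration`, `audio_path`, `audio_by_reciter`), the `surah`/`ayah` identifiers, the reciter keys, and the file paths are hypothetical, and the text values are placeholders rather than real data.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Word:
    # Hypothetical field names. The abstract only states that each word has
    # Arabic script, an English translation, a transliteration, and aligned audio.
    arabic: str
    translation: str
    transliteration: str
    audio_path: str


@dataclass
class Verse:
    # surah/ayah identifiers are an assumption made for indexing.
    surah: int
    ayah: int
    arabic: str
    translation: str
    transliteration: str
    # Maps a reciter identifier to that reciter's verse-level audio file.
    audio_by_reciter: Dict[str, str] = field(default_factory=dict)
    words: List[Word] = field(default_factory=list)


def word_audio_pairs(verse: Verse) -> List[Tuple[str, str]]:
    """Return (transliteration, audio_path) for every word in the verse."""
    return [(w.transliteration, w.audio_path) for w in verse.words]


def build_example_verse() -> Verse:
    """Build a placeholder verse; all values are illustrative, not real data."""
    return Verse(
        surah=1,
        ayah=1,
        arabic="<arabic-verse-text>",
        translation="<english-verse-translation>",
        transliteration="<verse-transliteration>",
        audio_by_reciter={
            "reciter_a": "audio/verses/reciter_a/1_1.mp3",
            "reciter_b": "audio/verses/reciter_b/1_1.mp3",
        },
        words=[
            Word("<arabic-1>", "<english-1>", "<translit-1>", "audio/words/1_1_1.mp3"),
            Word("<arabic-2>", "<english-2>", "<translit-2>", "audio/words/1_1_2.mp3"),
        ],
    )


if __name__ == "__main__":
    verse = build_example_verse()
    for translit, path in word_audio_pairs(verse):
        print(translit, path)
```

Keeping the per-reciter audio in a mapping on the verse, and the word-level audio on each word, mirrors the two granularities the abstract describes; a real loader would need to follow whatever layout the released files actually use.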

Taxonomy

Topics: Speech Recognition and Synthesis · Topic Modeling · Text and Document Classification Technologies