Quranic Conversations: Developing a Semantic Search tool for the Quran using Arabic NLP Techniques
Yasser Shohoud, Maged Shoman, Sarah Abdelazim

TL;DR
This paper presents a semantic search tool for the Quran that leverages Arabic NLP techniques and models trained on tafsir datasets to accurately retrieve relevant verses based on user inquiries.
Contribution
It introduces a novel semantic search method for the Quran using cosine similarity on models trained with extensive tafsir data, improving retrieval accuracy.
Findings
Achieved cosine similarity scores up to 0.97 for relevant verses
Developed a model trained on over 30 tafsirs for effective semantic matching
Enhanced Quranic verse retrieval for user inquiries
Abstract
The Holy Quran is believed to be the literal word of God (Allah), as revealed to the Prophet Muhammad (PBUH) over a period of approximately 23 years. In it, God provides guidance on how to live a righteous and just life, emphasizing principles such as honesty, compassion, charity and justice, and setting out rules for personal conduct, family matters, business ethics and much more. However, because of constraints related to the language and to the Quran's organization, it is challenging for Muslims to find all the relevant ayahs (verses) pertaining to a matter or inquiry of interest. Hence, we developed a Quran semantic search tool that finds the verses pertaining to the user's inquiry or prompt. To achieve this, we trained several models on a large dataset of over 30 tafsirs, where typically each tafsir corresponds to one verse in the Quran and, using cosine similarity,…
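To make the retrieval step concrete, here is a minimal sketch of ranking verses by cosine similarity between a query and per-verse commentary text. It is an illustration under stated assumptions, not the authors' implementation: the `embed` function is a toy bag-of-words stand-in for the trained models the abstract mentions, the function names are hypothetical, and the verse identifiers and commentary strings are invented placeholders rather than real tafsir text.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a trained embedding model: bag-of-words term counts."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


def search(query, commentary_by_verse, top_k=3):
    """Return the top_k (score, verse_id) pairs, highest similarity first."""
    query_vec = embed(query)
    scored = [
        (cosine_similarity(query_vec, embed(text)), verse_id)
        for verse_id, text in commentary_by_verse.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Placeholder corpus: hypothetical IDs and invented commentary snippets.
corpus = {
    "verse_a": "give charity to the poor and needy",
    "verse_b": "be honest in trade and business",
    "verse_c": "show patience during hardship",
}

results = search("honest business dealings", corpus, top_k=2)
for score, verse_id in results:
    print(f"{verse_id}: {score:.2f}")
```

In a real system the bag-of-words vectors would be replaced by dense embeddings from a model trained on the tafsir data, but the ranking logic (embed the query, score every verse, sort by similarity) would likely keep the same shape.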
Taxonomy
Topics: Advanced Text Analysis Techniques · Topic Modeling · Sentiment Analysis and Opinion Mining
Methods: VERtex Similarity Embeddings
