Cross-Language Approach for Quranic QA

Islam Oshallah; Mohamed Basem; Ali Hamdi; Ammar Mohammed

arXiv:2501.17449 · cs.CL · January 30, 2025

TL;DR

This paper introduces a cross-language approach combining dataset augmentation and language model fine-tuning to improve Quranic question answering, addressing linguistic disparities and limited data challenges.

Contribution

It proposes a cross-language methodology that uses machine translation and pre-trained language models to improve Quranic QA performance, presented as new to this domain.

Findings

1. RoBERTa-Base achieved a MAP@10 of 0.34 and an MRR of 0.52.

2. DeBERTa-v3-Base excelled in Recall@10 and Precision@10.

3. The approach significantly improves model performance in Quranic QA.
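The ranking metrics named in the findings (MAP@10, MRR, Recall@10, Precision@10) can be sketched as follows. This is a minimal illustration using common textbook definitions, not the paper's evaluation script. The passage IDs are hypothetical, and the average-precision normalization (dividing by the smaller of k and the number of relevant passages) is one of several conventions in use and is an assumption here.

```python
from typing import Sequence, Set


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 10) -> float:
    """Fraction of the top-k slots occupied by relevant passages."""
    hits = sum(1 for pid in ranked[:k] if pid in relevant)
    return hits / k


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 10) -> float:
    """Fraction of all relevant passages that appear in the top k."""
    if not relevant:
        return 0.0
    hits = sum(1 for pid in ranked[:k] if pid in relevant)
    return hits / len(relevant)


def average_precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 10) -> float:
    """Average of precision values at each rank where a relevant passage occurs.

    Assumption: normalized by min(k, number of relevant passages).
    Assumes the ranked list contains no duplicate IDs.
    """
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, pid in enumerate(ranked[:k], start=1):
        if pid in relevant:
            hits += 1
            total += hits / rank
    return total / min(k, len(relevant))


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant passage, or 0.0 if none is retrieved."""
    for rank, pid in enumerate(ranked, start=1):
        if pid in relevant:
            return 1.0 / rank
    return 0.0


def mean(values: Sequence[float]) -> float:
    """Mean over queries; MAP and MRR are means of the per-query scores."""
    return sum(values) / len(values) if values else 0.0


# Hypothetical single query: four retrieved passages, two of them relevant.
ranked_ids = ["v3", "v1", "v7", "v2"]
relevant_ids = {"v1", "v2"}

p10 = precision_at_k(ranked_ids, relevant_ids)
r10 = recall_at_k(ranked_ids, relevant_ids)
ap10 = average_precision_at_k(ranked_ids, relevant_ids)
rr = reciprocal_rank(ranked_ids, relevant_ids)
```

MAP@10 and MRR for a full test set would then be `mean(...)` over the per-query `average_precision_at_k` and `reciprocal_rank` values.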

Abstract

Question answering systems face critical limitations in languages with limited resources and scarce data, making the development of robust models especially challenging. The Quranic QA system holds significant importance as it facilitates a deeper understanding of the Quran, a holy text for over a billion people worldwide. However, these systems face unique challenges, including the linguistic disparity between questions written in Modern Standard Arabic and answers found in Quranic verses written in Classical Arabic, and the small size of existing datasets, which further restricts model performance. To address these challenges, we adopt a cross-language approach by (1) Dataset Augmentation: expanding and enriching the dataset through machine translation to convert Arabic questions into English, paraphrasing questions to create linguistic diversity, and retrieving answers from an…

Taxonomy

Topics: Education and Islamic Studies

Methods: ADaptive gradient method with the OPTimal convergence rate · ALIGN · Flan-T5