Two-Stage Quranic QA via Ensemble Retrieval and Instruction-Tuned Answer Extraction
Mohamed Basem, Islam Oshallah, Ali Hamdi, Khaled Shaban, and Hozaifa Kassab

TL;DR
This paper introduces a two-stage Quranic question answering framework that combines ensemble retrieval with instruction-tuned answer extraction, achieving state-of-the-art results on the Quran QA 2023 Shared Task.
Contribution
The paper pairs an ensemble of fine-tuned Arabic language models for passage retrieval with few-shot prompted, instruction-tuned large language models for answer extraction, which sidesteps the limitations of fine-tuning extractors on a small dataset.
Findings
Achieved MAP@10 of 0.3128 and MRR@10 of 0.5763 for retrieval.
Attained pAP@10 of 0.669 for answer extraction.
Substantially outperformed previous methods on the Quran QA 2023 Shared Task.
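For readers unfamiliar with the retrieval metrics above, the following is a minimal sketch of how MAP@10 and MRR@10 are commonly computed. It is not the shared task's official scorer: the normalization by min(|relevant|, k), the handling of queries with no relevant passages, and the passage and query IDs are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Set


def average_precision_at_k(ranked: List[str], relevant: Set[str], k: int = 10) -> float:
    """AP@k for one query; assumes normalization by min(|relevant|, k)."""
    if not relevant:
        return 0.0  # assumption: queries without relevant passages score 0
    hits = 0
    precision_sum = 0.0
    for rank, pid in enumerate(ranked[:k], start=1):
        if pid in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this cut-off
    return precision_sum / min(len(relevant), k)


def reciprocal_rank_at_k(ranked: List[str], relevant: Set[str], k: int = 10) -> float:
    """1/rank of the first relevant passage within the top k, else 0."""
    for rank, pid in enumerate(ranked[:k], start=1):
        if pid in relevant:
            return 1.0 / rank
    return 0.0


def mean_metric(
    metric: Callable[[List[str], Set[str], int], float],
    runs: Dict[str, List[str]],
    qrels: Dict[str, Set[str]],
    k: int = 10,
) -> float:
    """Average a per-query metric over all queries in `runs`."""
    scores = [metric(runs[q], qrels.get(q, set()), k) for q in runs]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy data: two queries with ranked passage IDs and gold labels.
runs = {"q1": ["p3", "p1", "p7"], "q2": ["p9", "p2"]}
qrels = {"q1": {"p1", "p7"}, "q2": {"p9"}}

map_at_10 = mean_metric(average_precision_at_k, runs, qrels)
mrr_at_10 = mean_metric(reciprocal_rank_at_k, runs, qrels)
```

MRR@10 only rewards the first relevant passage, while MAP@10 rewards every relevant passage in the top ten, which is why the two reported values (0.5763 and 0.3128) can differ widely on the same ranking.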
Abstract
Quranic Question Answering presents unique challenges due to the linguistic complexity of Classical Arabic and the semantic richness of religious texts. In this paper, we propose a novel two-stage framework that addresses both passage retrieval and answer extraction. For passage retrieval, we ensemble fine-tuned Arabic language models to achieve superior ranking performance. For answer extraction, we employ instruction-tuned large language models with few-shot prompting to overcome the limitations of fine-tuning on small datasets. Our approach achieves state-of-the-art results on the Quran QA 2023 Shared Task, with a MAP@10 of 0.3128 and MRR@10 of 0.5763 for retrieval, and a pAP@10 of 0.669 for extraction, substantially outperforming previous methods. These results demonstrate that combining model ensembling and instruction-tuned language models effectively addresses the challenges of…
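The abstract does not say how the fine-tuned retrievers are combined, so the following is only one plausible sketch of score-level ensembling: each model's passage scores are min-max normalized per query and then summed. The fusion rule, the function names, and the example scores are assumptions for illustration, not the authors' method.

```python
from typing import Dict, List


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale one model's scores to [0, 1] so models are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {pid: 0.0 for pid in scores}  # no ranking signal from this model
    return {pid: (s - lo) / (hi - lo) for pid, s in scores.items()}


def ensemble_rank(model_scores: List[Dict[str, float]], k: int = 10) -> List[str]:
    """Sum normalized scores across models and return the top-k passage IDs."""
    fused: Dict[str, float] = {}
    for scores in model_scores:
        for pid, s in min_max_normalize(scores).items():
            fused[pid] = fused.get(pid, 0.0) + s
    # Sort by fused score (descending); break ties by passage ID for determinism.
    ranked = sorted(fused, key=lambda pid: (-fused[pid], pid))
    return ranked[:k]


# Hypothetical scores from two retrievers with different score scales.
model_a = {"p1": 0.9, "p2": 0.1, "p3": 0.5}
model_b = {"p1": 2.0, "p2": 8.0, "p3": 6.0}

top_passages = ensemble_rank([model_a, model_b], k=2)
```

Normalizing before summing keeps a model with a wide raw score range from dominating the ensemble; rank-based fusion would be an equally reasonable alternative under the same assumptions.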
Taxonomy
Topics: Topic Modeling · Advanced Text Analysis Techniques · Natural Language Processing Techniques
