Evaluation of Semantic Search and its Role in Retrieved-Augmented-Generation (RAG) for Arabic Language

Ali Mahboub; Muhy Eddin Za'ter; Bashar Al-Rfooh; Yazan Estaitia; Adnan Jaljuli; Asma Hakouz

arXiv:2403.18350·cs.CL·May 31, 2024·1 cites

Open Access

TL;DR

This paper introduces a new benchmark for evaluating semantic search in Arabic and assesses its effectiveness within retrieval-augmented generation (RAG) frameworks, addressing challenges unique to the language.

Contribution

It establishes a straightforward benchmark for Arabic semantic search and evaluates its performance in RAG, filling a gap left by the lack of standard benchmarks for Arabic.

Findings

01 Proposed a new benchmark dataset for Arabic semantic search
02 Evaluated semantic search metrics within RAG framework
03 Identified challenges specific to Arabic language in semantic search

Abstract

The latest advancements in machine learning and deep learning have brought forth the concept of semantic similarity, which has proven immensely beneficial in multiple applications and has largely replaced keyword search. However, evaluating semantic similarity and searching for a specific query across many documents remain complicated tasks. This complexity stems from the multifaceted nature of the task and the lack of standard benchmarks, and these challenges are further amplified for the Arabic language. This paper endeavors to establish a straightforward yet potent benchmark for semantic search in Arabic. Moreover, to precisely evaluate the effectiveness of these metrics and the dataset, we conduct our assessment of semantic search within the framework of retrieval-augmented generation (RAG).
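To make the idea of evaluating semantic search concrete, the following is a minimal sketch of ranking documents by cosine similarity to a query and scoring the ranking. The embedding vectors, document ids, and relevance labels are hypothetical toy values, and recall@k and reciprocal rank are common retrieval metrics chosen here for illustration; the abstract does not state which metrics the paper uses.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_documents(query_vec, doc_vecs):
    """Return document ids sorted by descending similarity to the query."""
    return sorted(doc_vecs, key=lambda d: cosine(query_vec, doc_vecs[d]), reverse=True)


def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant documents that appear in the top k results."""
    hits = sum(1 for d in ranked[:k] if d in relevant)
    return hits / len(relevant) if relevant else 0.0


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant document, or 0.0 if none is retrieved."""
    for i, d in enumerate(ranked, start=1):
        if d in relevant:
            return 1.0 / i
    return 0.0


# Hypothetical toy embeddings; a real system would obtain these from an
# embedding model applied to Arabic queries and documents.
doc_vecs = {
    "d1": [1.0, 0.0, 0.0],
    "d2": [0.0, 1.0, 0.0],
    "d3": [0.7, 0.7, 0.0],
}
query_vec = [0.9, 0.1, 0.0]
relevant = {"d3"}  # assumed ground-truth label for this query

ranked = rank_documents(query_vec, doc_vecs)
print(ranked)
print(recall_at_k(ranked, relevant, k=2))
print(reciprocal_rank(ranked, relevant))
```

In a RAG setting, the top-ranked documents would be passed to a generator as context, so a ranking that places relevant documents lower directly limits what the generator can use.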

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems