ARCOQ: Arabic Closest Opposite Questions Dataset

Sandra Rizkallah, Amir F. Atiya, and Samir Shaheen

arXiv:2310.14384 · cs.CL · October 24, 2023 · 1 citation

TL;DR

This paper introduces ARCOQ, the first Arabic dataset for closest opposite questions, enabling evaluation of antonymy detection systems in Arabic with benchmark results for various word embeddings.

Contribution

It provides the first Arabic closest opposite questions dataset, along with standard splits and benchmark results for different Arabic word embedding models.

Findings

01

The dataset contains 500 questions, each paired with its correct answer.

02

Benchmark results show varying performance of Arabic word embeddings.

03

Dataset is publicly available for research use.

Abstract

This paper presents a dataset of closest opposite questions in the Arabic language, the first of its kind for Arabic. The dataset is useful for assessing how well systems detect antonymy. Its structure is similar to that of the Graduate Record Examination (GRE) closest opposite questions dataset for English. The dataset consists of 500 questions, each containing a query word whose closest opposite must be selected from a set of candidate words. Each question is also associated with the correct answer. We release the dataset publicly and provide standard splits into development and test sets. Moreover, the paper provides a benchmark of the performance of different Arabic word embedding models on the dataset.
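
To make the task format concrete, here is a minimal sketch of how a word embedding model could be scored on closest opposite questions. It is an illustration only, not the benchmark procedure from the paper: the question layout (`query`, `candidates`, `answer`), the toy two-dimensional vectors, and the rule of picking the candidate with the lowest cosine similarity to the query are all assumptions, and English placeholder words stand in for the Arabic entries.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def closest_opposite(query, candidates, embeddings):
    """Return the candidate least similar to the query.

    Assumed heuristic: lowest cosine similarity marks the opposite.
    """
    query_vec = embeddings[query]
    return min(candidates, key=lambda w: cosine(query_vec, embeddings[w]))

def accuracy(questions, embeddings):
    """Fraction of questions where the predicted word equals the answer."""
    correct = sum(
        closest_opposite(q["query"], q["candidates"], embeddings) == q["answer"]
        for q in questions
    )
    return correct / len(questions)

# Hypothetical toy data; the real dataset holds 500 Arabic questions.
toy_embeddings = {
    "hot": [1.0, 0.2],
    "cold": [-0.9, 0.1],
    "warm": [0.8, 0.3],
    "table": [0.1, 1.0],
}
toy_questions = [
    {"query": "hot", "candidates": ["cold", "warm", "table"], "answer": "cold"},
]

print(closest_opposite("hot", ["cold", "warm", "table"], toy_embeddings))
print(accuracy(toy_questions, toy_embeddings))
```

With real distributional embeddings, antonyms often lie close to each other because they occur in similar contexts, so this lowest-similarity rule may perform poorly; any selection rule would normally be tuned on the development split and reported on the test split.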

Code & Models

Repositories

sandrarizkallah/arcoq-dataset (Official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification

Methods: Sparse Evolutionary Training