ARCOQ: Arabic Closest Opposite Questions Dataset
Sandra Rizkallah, Amir F. Atiya, and Samir Shaheen

TL;DR
This paper introduces ARCOQ, the first Arabic dataset for closest opposite questions, enabling evaluation of antonymy detection systems in Arabic with benchmark results for various word embeddings.
Contribution
It provides the first Arabic closest opposite questions dataset, along with standard splits and benchmark results for different Arabic word embedding models.
Findings
Dataset contains 500 questions, each paired with its correct answer.
Benchmark results show varying performance of Arabic word embeddings.
Dataset is publicly available for research use.
Abstract
This paper presents a dataset for closest opposite questions in the Arabic language. The dataset is the first of its kind for Arabic. It is useful for assessing systems on antonymy detection. Its structure is similar to that of the Graduate Record Examination (GRE) closest opposite questions dataset for the English language. The introduced dataset consists of 500 questions, each containing a query word for which the closest opposite must be chosen from a set of candidate words. Each question is also associated with the correct answer. We release the dataset publicly and provide standard splits into development and test sets. Moreover, the paper provides a benchmark for the performance of different Arabic word embedding models on the introduced dataset.
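To make the task format concrete, here is a minimal sketch of how a word embedding model might be scored on closest opposite questions. It rests on several assumptions not taken from the paper: the question layout (`query`, `candidates`, `answer` keys), the toy two-dimensional vectors, the English placeholder words, and above all the selection rule, which simply picks the candidate with the lowest cosine similarity to the query. This is a naive baseline for illustration, not necessarily the method used in the paper's benchmark.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def closest_opposite(query, candidates, embeddings):
    """Assumed heuristic: return the candidate least similar to the query word."""
    q = embeddings[query]
    return min(candidates, key=lambda w: cosine(q, embeddings[w]))


def accuracy(questions, embeddings):
    """Fraction of questions whose predicted candidate equals the labeled answer."""
    correct = sum(
        closest_opposite(q["query"], q["candidates"], embeddings) == q["answer"]
        for q in questions
    )
    return correct / len(questions)


# Hypothetical toy data; real use would load Arabic embeddings and the dataset's questions.
toy_embeddings = {
    "hot": [1.0, 0.2],
    "cold": [-0.9, 0.1],
    "warm": [0.8, 0.3],
    "spicy": [0.6, 0.7],
}
toy_questions = [
    {"query": "hot", "candidates": ["cold", "warm", "spicy"], "answer": "cold"},
]

print(closest_opposite("hot", ["cold", "warm", "spicy"], toy_embeddings))  # cold
print(accuracy(toy_questions, toy_embeddings))  # 1.0
```

Distributional embeddings often place antonyms close together because they occur in similar contexts, so a lowest-similarity rule like this one can perform poorly on real data. That limitation is part of what a dataset of this kind lets one measure.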
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
Methods: Sparse Evolutionary Training
