A Benchmark Arabic Dataset for Commonsense Explanation

Saja AL-Tawalbeh; Mohammad AL-Smadi

arXiv:2012.10251 · cs.CL · December 21, 2020 · 1 cites

Open Access

TL;DR

This paper introduces a new benchmark dataset for Arabic commonsense explanation, providing a resource to evaluate and improve machine understanding of Arabic language and reasoning.

Contribution

It presents the first benchmark Arabic dataset for commonsense explanation, including baseline results to facilitate future research in this area.

Findings

01. The dataset contains Arabic sentences that do not make sense, each with three candidate explanations of why it is false

02. Baseline results are reported to support future evaluation

03. The dataset is publicly available on GitHub under the CC-BY-SA 4.0 license

Abstract

Language comprehension and commonsense knowledge validation by machines are challenging tasks that remain under-researched and under-evaluated for Arabic text. In this paper, we present a benchmark Arabic dataset for commonsense explanation. The dataset consists of Arabic sentences that do not make sense, each paired with three choices from which to select the one that explains why the sentence is false. Furthermore, this paper presents baseline results to assist and encourage future evaluation of research in this field. The dataset is distributed under the Creative Commons CC-BY-SA 4.0 license and can be found on GitHub.
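The task format described in the abstract (one nonsensical sentence, three candidate explanations, one correct choice) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the `ExplanationItem` class, its field names, the `accuracy` helper, and the trivial `always_first` predictor are hypothetical and do not reflect the dataset's actual file layout or the paper's baseline models. The two example items are invented English placeholders; the real dataset is in Arabic.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class ExplanationItem:
    """One multiple-choice item (hypothetical schema, not the dataset's real columns)."""
    sentence: str            # the sentence that does not make sense
    options: Sequence[str]   # three candidate explanations
    answer: int              # index (0, 1, or 2) of the correct explanation

    def __post_init__(self) -> None:
        # Enforce the three-choice format described in the abstract.
        if len(self.options) != 3:
            raise ValueError("expected exactly three candidate explanations")
        if not 0 <= self.answer < 3:
            raise ValueError("answer must be 0, 1, or 2")


def accuracy(items: Sequence[ExplanationItem],
             predict: Callable[[ExplanationItem], int]) -> float:
    """Fraction of items for which `predict` returns the correct option index."""
    if not items:
        return 0.0
    correct = sum(1 for item in items if predict(item) == item.answer)
    return correct / len(items)


def always_first(item: ExplanationItem) -> int:
    """Trivial placeholder predictor: always choose the first option."""
    return 0


# Invented placeholder items, for illustration only.
EXAMPLE_ITEMS = [
    ExplanationItem(
        sentence="He put an elephant in the fridge.",
        options=(
            "An elephant is much bigger than a fridge.",
            "Elephants are usually grey.",
            "A fridge keeps food cold.",
        ),
        answer=0,
    ),
    ExplanationItem(
        sentence="She drank a cup of sand.",
        options=(
            "Cups can be made of glass.",
            "Sand is found on beaches.",
            "Sand is not a drinkable liquid.",
        ),
        answer=2,
    ),
]

if __name__ == "__main__":
    print(f"always_first accuracy: {accuracy(EXAMPLE_ITEMS, always_first):.2f}")
```

Accuracy over the chosen option index is a natural metric for a three-way multiple-choice task, and any model can be plugged in by passing a different `predict` function. Which metric the paper's baselines actually report is not stated in the abstract.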

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Explainable Artificial Intelligence (XAI)