MuDRiC: Multi-Dialect Reasoning for Arabic Commonsense Validation
Kareem Elozeiri, Mervat Abassy, Preslav Nakov, Yuxia Wang

TL;DR
This paper introduces MuDRiC, the first multi-dialect Arabic commonsense reasoning dataset, and a GCN-based method that improves validation accuracy across diverse Arabic dialects, advancing natural language understanding in Arabic.
Contribution
The paper presents a novel multi-dialect Arabic commonsense dataset and a GCN-based reasoning method, addressing the underexplored dialectal diversity in Arabic NLP.
Findings
The proposed GCN-based method outperforms baseline language model fine-tuning.
MuDRiC covers multiple Arabic dialects, filling a key resource gap.
The GCN approach improves semantic relationship modeling.
Abstract
Commonsense validation evaluates whether a sentence aligns with everyday human understanding, a critical capability for developing robust natural language understanding systems. While substantial progress has been made in English, the task remains underexplored in Arabic, particularly given its rich linguistic diversity. Existing Arabic resources have primarily focused on Modern Standard Arabic (MSA), leaving regional dialects underrepresented despite their prevalence in spoken contexts. To bridge this gap, we present two key contributions. We introduce MuDRiC, an extended Arabic commonsense dataset incorporating multiple dialects. To the best of our knowledge, this is the first Arabic multi-dialect commonsense reasoning dataset. We further propose a novel method adapting Graph Convolutional Networks (GCNs) to Arabic commonsense reasoning, which enhances semantic relationship modeling…
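For readers unfamiliar with GCNs, the following is a minimal sketch of one generic graph convolution layer in the standard formulation, ReLU(D^{-1/2}(A + I)D^{-1/2} H W). It is not the authors' architecture: the abstract does not say how the graph is built or how the layer is adapted to Arabic, so the toy three-node graph, the feature sizes, and the function names `normalize_adjacency` and `gcn_layer` are all illustrative assumptions.

```python
import numpy as np


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, the symmetrically normalized adjacency."""
    a_hat = adj + np.eye(adj.shape[0])   # add self-loops so each node keeps its own features
    deg = a_hat.sum(axis=1)              # node degrees, self-loop included
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(adj_norm, features, weight):
    """One graph convolution: aggregate neighbor features, project, apply ReLU."""
    return np.maximum(adj_norm @ features @ weight, 0.0)


# Hypothetical toy graph: three nodes in a chain (0 - 1 - 2).
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)

rng = np.random.default_rng(0)
features = rng.normal(size=(3, 4))   # 3 nodes, 4-dimensional input features (assumed sizes)
weight = rng.normal(size=(4, 2))     # projects to 2 hidden dimensions (assumed size)

hidden = gcn_layer(normalize_adjacency(adj), features, weight)
print(hidden.shape)
```

Each output row mixes a node's own features with those of its neighbors, which is the sense in which a GCN models relationships between connected units.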
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
