Arabic Dialect Classification using RNNs, Transformers, and Large Language Models: A Comparative Analysis

Omar A. Essameldin; Ali O. Elbeih; Wael H. Gomaa; Wael F. Elsersy

arXiv:2506.19753 · cs.CL · July 1, 2025

TL;DR

This paper compares RNNs, Transformers, and large language models for classifying 18 Arabic dialects, highlighting MARBERTv2's superior performance and discussing applications in chatbots and social media analysis.

Contribution

It provides a comprehensive comparison of NLP models for Arabic dialect classification, introducing effective preprocessing and prompt engineering techniques.
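
The listing does not spell out which preprocessing steps the paper uses. As a rough illustration only, the sketch below shows cleaning steps commonly applied to Arabic tweets before dialect classification: removing URLs and mentions, stripping diacritics and tatweel, normalizing alef variants, and collapsing elongated letters. The function name `clean_tweet` and the exact set of steps are assumptions, not the authors' pipeline.

```python
import re

# Hypothetical cleaning rules; the paper's actual steps may differ.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
# Arabic short vowels, tanween, shadda, sukun, and superscript alef.
DIACRITICS_RE = re.compile(r"[\u064B-\u0652\u0670]")
TATWEEL = "\u0640"  # elongation character used for visual stretching
ALEF_VARIANTS_RE = re.compile(r"[\u0622\u0623\u0625]")  # alef with madda/hamza
REPEAT_RE = re.compile(r"(.)\1{2,}")  # any character repeated 3+ times


def clean_tweet(text: str) -> str:
    """Apply a simple, assumed normalization pass to one tweet."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = DIACRITICS_RE.sub("", text)
    text = text.replace(TATWEEL, "")
    text = ALEF_VARIANTS_RE.sub("\u0627", text)  # map to bare alef
    text = REPEAT_RE.sub(r"\1\1", text)  # keep at most two repeats
    return " ".join(text.split())  # collapse whitespace
```

Whether steps such as alef normalization help or hurt is an empirical question for dialect identification, since some orthographic variation can itself carry dialect cues.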

Findings

1. MARBERTv2 achieved 65% accuracy and 64% F1-score.
2. State-of-the-art NLP models improve dialect identification.
3. Applications include chatbots and social media monitoring.

Abstract

Arabic is among the most widely spoken languages in the world, with a wide variety of dialects spoken across 22 countries. In this study, we address the problem of classifying the 18 Arabic dialects in the QADI dataset of Arabic tweets. RNN models, Transformer models, and large language models (LLMs) via prompt engineering are built and evaluated. Among these, MARBERTv2 performed best, with 65% accuracy and a 64% F1-score. Through the use of state-of-the-art preprocessing techniques and the latest NLP models, this paper identifies the most significant linguistic issues in Arabic dialect identification. The results support applications such as personalized chatbots that respond in users' dialects, social media monitoring, and greater accessibility for Arabic communities.
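
To make the two reported metrics concrete, here is a minimal sketch of how accuracy and F1 can be computed for a multi-class dialect classifier. The abstract does not say which F1 averaging was used, so macro-averaging is an assumption here, and the two-letter labels and predictions are made-up toy data rather than results from the paper.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label matches the gold label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # gold class gains a false negative
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN), defined as 0 when the class never appears.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example with three illustrative dialect labels.
y_true = ["EG", "EG", "SA", "SA", "MA", "MA"]
y_pred = ["EG", "SA", "SA", "SA", "MA", "EG"]

print(f"accuracy: {accuracy(y_true, y_pred):.3f}")
print(f"macro F1: {macro_f1(y_true, y_pred):.3f}")
```

With 18 classes of uneven size, accuracy and macro F1 can diverge noticeably, which is one reason both are usually reported together.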

Taxonomy

Topics: Authorship Attribution and Profiling · Linguistic Variation and Morphology · Natural Language Processing Techniques