How Well Do LLMs Understand Tunisian Arabic?

Mohamed Mahdi

arXiv:2511.16683·cs.CL·November 24, 2025

Open Access

TL;DR

This paper evaluates how well large language models understand Tunisian Arabic, highlighting gaps and emphasizing the need for inclusive AI that supports low-resource languages.

Contribution

Introduces a new dataset of parallel Tunizi, standard Tunisian Arabic, and English text with sentiment labels, and benchmarks LLMs on transliteration, translation, and sentiment analysis.

Findings

01 Significant variation in model performance across tasks

02 Identified limitations in LLM understanding of Tunisian dialects

03 Highlighted the importance of supporting low-resource languages in AI

Abstract

Large Language Models (LLMs) are the engines driving today's AI agents. The better these models understand human languages, the more natural and user-friendly the interaction with AI becomes, from everyday devices like computers and smartwatches to any tool that can act intelligently. Yet, the ability of industrial-scale LLMs to comprehend low-resource languages, such as Tunisian Arabic (Tunizi), is often overlooked. This neglect risks excluding millions of Tunisians from fully interacting with AI in their own language, pushing them toward French or English. Such a shift not only threatens the preservation of the Tunisian dialect but may also create challenges for literacy and influence younger generations to favor foreign languages. In this study, we introduce a novel dataset containing parallel Tunizi, standard Tunisian Arabic, and English translations, along with sentiment labels. We…

Taxonomy

Topics: Natural Language Processing Techniques · ICT in Developing Communities · Big Data and Digital Economy