CamelEval: Advancing Culturally Aligned Arabic Language Models and Benchmarks
Zhaozhi Qian, Faroq Altam, Muhammad Alqurishi, Riad Souissi

TL;DR
This paper introduces Juhaina, a culturally aligned Arabic-English LLM with 9.24 billion parameters, and proposes CamelEval, a new benchmark for evaluating Arabic LLMs. Juhaina is reported to outperform existing models of comparable size.
Contribution
The paper presents Juhaina, a new bilingual Arabic-English LLM tailored to cultural values, and introduces CamelEval, a benchmark for evaluating Arabic language models.
Findings
Juhaina outperforms comparably sized Llama and Gemma models in generating helpful Arabic responses.
Juhaina provides more accurate regional information and cultural understanding.
CamelEval effectively evaluates Arabic LLMs' cultural and linguistic capabilities.
Abstract
Large Language Models (LLMs) are the cornerstones of modern artificial intelligence systems. This paper introduces Juhaina, an Arabic-English bilingual LLM specifically designed to align with the values and preferences of Arabic speakers. Juhaina inherently supports advanced functionalities such as instruction following, open-ended question answering, information provisioning, and text processing. Our model contains 9.24 billion parameters and is trained on a context window of up to 8,192 tokens. This paper details the creation process of Juhaina and provides an extensive empirical evaluation. Furthermore, we identify the limitations of the widely adopted Open Arabic LLM Leaderboard (OALL) and propose a new evaluation benchmark, CamelEval. Our findings demonstrate that Juhaina surpasses existing LLMs of comparable sizes, such as the Llama and Gemma families, in generating helpful responses…
Taxonomy
Topics: Language, Linguistics, Cultural Analysis · African history and culture analysis · Media, Religion, Digital Communication
Methods: LLaMA · ALIGN
