LAILA: A Large Trait-Based Dataset for Arabic Automated Essay Scoring

May Bashendy, Walid Massoud, Sohaila Eltanbouly, Salam Albatarni, Marwan Sayed, Abrar Abir, Houda Bouamor, and Tamer Elsayed

arXiv:2512.24235 · cs.CL · January 27, 2026

TL;DR

LAILA is the largest publicly available Arabic AES dataset to date: 7,859 essays annotated with holistic scores and seven trait-specific scores, accompanied by benchmark results for state-of-the-art Arabic and English models.

Contribution

This paper introduces LAILA, a large-scale Arabic AES dataset with holistic and trait-specific annotations on seven dimensions, and reports benchmark results in prompt-specific and cross-prompt settings.

Findings

01 State-of-the-art models achieve promising results on LAILA

02 Cross-prompt performance indicates the dataset's robustness

03 Trait-specific scoring enhances AES accuracy

Abstract

Automated Essay Scoring (AES) has gained increasing attention in recent years, yet research on Arabic AES remains limited due to the lack of publicly available datasets. To address this, we introduce LAILA, the largest publicly available Arabic AES dataset to date, comprising 7,859 essays annotated with holistic and trait-specific scores on seven dimensions: relevance, organization, vocabulary, style, development, mechanics, and grammar. We detail the dataset design, collection, and annotations, and provide benchmark results using state-of-the-art Arabic and English models in prompt-specific and cross-prompt settings. LAILA fills a critical need in Arabic AES research, supporting the development of robust scoring systems.
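
To make the annotation scheme and the two evaluation settings concrete, here is a minimal Python sketch. Only the seven trait names are taken from the abstract. The `Essay` record, its field names, and the helper functions are hypothetical and do not reflect the released file format. The cross-prompt helper uses a leave-one-prompt-out split, a common way to set up cross-prompt evaluation; it is assumed here for illustration and may differ from the protocol the authors used.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List, Tuple

# Trait names as listed in the abstract; everything else below is hypothetical.
TRAITS = ("relevance", "organization", "vocabulary", "style",
          "development", "mechanics", "grammar")


@dataclass
class Essay:
    # Hypothetical record layout, not the dataset's actual schema.
    essay_id: str
    prompt_id: str
    text: str
    holistic: float
    traits: Dict[str, float]  # one score per name in TRAITS


def prompt_specific_groups(essays: List[Essay]) -> Dict[str, List[Essay]]:
    """Group essays by prompt; each group is trained and tested on its own."""
    groups: Dict[str, List[Essay]] = {}
    for essay in essays:
        groups.setdefault(essay.prompt_id, []).append(essay)
    return groups


def cross_prompt_splits(
    essays: List[Essay],
) -> Iterator[Tuple[str, List[Essay], List[Essay]]]:
    """Yield (held_out_prompt, train, test) with one prompt held out per split."""
    for held_out in sorted({e.prompt_id for e in essays}):
        train = [e for e in essays if e.prompt_id != held_out]
        test = [e for e in essays if e.prompt_id == held_out]
        yield held_out, train, test
```

In the prompt-specific setting a model sees essays from the target prompt during training, whereas in the cross-prompt setting the target prompt is unseen, which is why the two settings are benchmarked separately.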

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Biomedical Text Mining and Ontologies