BUSTED at AraGenEval Shared Task: A Comparative Study of Transformer-Based Models for Arabic AI-Generated Text Detection
Ali Zain, Sareem Farooqui, Muhammad Rafi

TL;DR
This study compares three transformer-based models (AraELECTRA, CAMeLBERT, and XLM-RoBERTa) for detecting Arabic AI-generated text and finds that the multilingual XLM-RoBERTa outperforms the two Arabic-specific models, reaching an F1 score of 0.7701.
Contribution
It shows that a fine-tuned multilingual transformer (XLM-RoBERTa) can outperform specialized Arabic models (AraELECTRA and CAMeLBERT) on Arabic AI-generated text detection.
Findings
XLM-RoBERTa achieved the highest F1 score of 0.7701.
The multilingual XLM-RoBERTa outperformed both Arabic-specific models, AraELECTRA and CAMeLBERT.
The work highlights the potential of generalist models in language-specific tasks.
Abstract
This paper details our submission to the AraGenEval Shared Task on Arabic AI-generated text detection, where our team, BUSTED, secured 5th place. We investigated the effectiveness of three pre-trained transformer models: AraELECTRA, CAMeLBERT, and XLM-RoBERTa. Our approach involved fine-tuning each model on the provided dataset for a binary classification task. Our findings revealed a surprising result: the multilingual XLM-RoBERTa model achieved the highest performance with an F1 score of 0.7701, outperforming the specialized Arabic models. This work underscores the complexities of AI-generated text detection and highlights the strong generalization capabilities of multilingual models.
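For readers unfamiliar with the reported metric, the following is a minimal sketch of how a binary F1 score is computed from a classifier's predictions. The label convention (1 = AI-generated, 0 = human-written), the function name `binary_f1`, and the toy labels are assumptions for illustration only. The abstract does not state which class is treated as positive or which F1 averaging the shared task used, and none of the numbers below come from the paper.

```python
def binary_f1(y_true, y_pred, positive=1):
    """F1 for the positive class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both 0 (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels (hypothetical): 1 = AI-generated, 0 = human-written.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
score = binary_f1(y_true, y_pred)
print(f"F1 = {score:.4f}")
```

Because F1 balances precision against recall, it can differ noticeably from plain accuracy when the two classes are imbalanced or when a model errs mostly in one direction.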
