# Tackling toxicity in Arabic social media through advanced detection techniques

**Authors:** Loay Hatem, Ahmed Omar, Abdelmgeid A. Ali, Heba Mamdouh Farghaly

PMC · DOI: 10.1038/s41598-025-25879-4 · Scientific Reports · 2025-11-21

## TL;DR

This paper introduces a new Arabic dataset for detecting toxic content on social media and shows that a fine-tuned MARBERTv2 model reaches an F1-score of 92.43% and an accuracy of 92.21%.

## Contribution

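The paper presents a new Arabic dataset for toxicity and abuse detection on online social networks, annotated by five native and fluent Arabic speakers and linguists. It benchmarks sixteen machine learning algorithms, the FastText model, and seven transfer learning architectures on that dataset.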

## Key findings

- The fine-tuned MARBERTv2 model with BERT embedding outperformed the other models, achieving an F1-score of 92.43% and an accuracy of 92.21%.
- The proposed Arabic dataset was annotated by five experts who are native and fluent Arabic speakers and linguists.
- The study highlights the importance of addressing toxicity in diverse languages like Arabic.
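
For readers unfamiliar with the reported metrics, the following is a minimal sketch of how accuracy and F1-score are computed for binary toxic/non-toxic labels. The function name `binary_metrics` and the label lists are hypothetical and are not taken from the paper; this summary does not state which F1 averaging scheme the authors used, so the positive-class F1 shown here is an assumption.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall, f1) for binary labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, precision, recall, f1


# Hypothetical labels for illustration only: 1 = toxic, 0 = non-toxic.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
accuracy, precision, recall, f1 = binary_metrics(y_true, y_pred)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f}")
```

Accuracy counts every correct prediction equally, while F1 ignores true negatives, which is why the two figures can differ on datasets where one class dominates.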

## Abstract

Online social networks (OSNs) are currently the most widely used interactive media for interpersonal communication, emotional expression, and information sharing. Alongside helpful and engaging content, inappropriate or abusive content, such as toxicity, hate speech, and insults, is sometimes shared on these networks. Any kind of online abuse, including but not limited to cyberbullying, discrimination, abusive language, profanity, flames, hate speech, and harassment, is considered toxic content. Most toxicity detection work has focused on English text, with comparatively little effort devoted to Arabic. In this work, we constructed a standard Arabic dataset that can be used for toxicity and abuse detection on OSNs. The proposed dataset was annotated by five experts who are native and fluent Arabic speakers and linguists. To evaluate the dataset, we conducted a series of experiments comparing sixteen machine learning algorithms, the FastText model, and seven transfer learning architectures. Furthermore, we used four word embedding techniques: bag of words (BOW), term frequency–inverse document frequency (TF-IDF), FastText, and bidirectional encoder representations from transformers (BERT). Our experimental results demonstrated that the fine-tuned MARBERTv2 model with BERT embedding outperforms the other models, achieving an F1-score of 92.43% and an accuracy of 92.21%. Notably, this study highlights the importance of addressing toxicity on social media platforms while considering diverse languages and cultures. This marks a significant breakthrough in the classification of toxic tweets in Arabic.
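
Two of the representations named in the abstract, BOW and TF-IDF, can be illustrated with a short standard-library sketch. The helper names `bag_of_words` and `tf_idf`, the whitespace tokenizer, the toy English documents, and the specific TF-IDF variant (term frequency normalized by document length, `idf = log(N / df)`) are all assumptions made for illustration; this summary does not describe the authors' preprocessing or weighting choices.

```python
import math
from collections import Counter


def bag_of_words(docs):
    """Return (vocabulary, count vectors) using whitespace tokenization."""
    tokenized = [doc.split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tf_idf(docs):
    """Weight BOW counts by tf * idf, one common TF-IDF variant."""
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) for d in df]
    weighted = []
    for row in counts:
        total = sum(row) or 1  # guard against empty documents
        weighted.append([(c / total) * w for c, w in zip(row, idf)])
    return vocab, weighted


# Toy placeholder documents, not drawn from the paper's dataset.
docs = ["good morning friend", "bad bad comment", "good comment"]
vocab, bow = bag_of_words(docs)
_, tfidf = tf_idf(docs)
print(vocab)
print(bow[1])
print([round(x, 3) for x in tfidf[1]])
```

In the second toy document, the repeated term that appears in no other document receives a higher TF-IDF weight than the term shared with another document, which is the behavior that distinguishes TF-IDF from raw counts.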

## Full-text entities

- **Diseases:** toxicity (MESH:D064420), abuse (MESH:D019966)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12638896/full.md

## Figures

19 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12638896/full.md

## References

36 references — full list in the complete paper: https://tomesphere.com/paper/PMC12638896/full.md

---
Source: https://tomesphere.com/paper/PMC12638896