A Comparative Benchmark of a Moroccan Darija Toxicity Detection Model (Typica.ai) and Major LLM-Based Moderation APIs (OpenAI, Mistral, Anthropic)

Hicham Assoudi

arXiv:2505.04640·cs.CL·May 9, 2025

TL;DR

This study benchmarks Typica.ai's Moroccan Darija toxicity detection model against LLM-based moderation from OpenAI, Mistral, and Anthropic, and reports that Typica.ai performs best on culturally nuanced content.

Contribution

It provides a comparative evaluation of a culturally adapted toxicity detection model against major LLM moderation APIs for Moroccan Darija.

Findings

01

Typica.ai's model outperforms the OpenAI, Mistral, and Anthropic baselines at toxicity detection on the balanced Darija test set.

02

Culturally grounded models are more effective for Moroccan Darija content.

03

Challenges remain in detecting implicit and culturally specific toxicity.

Abstract

This paper presents a comparative benchmark evaluating the performance of Typica.ai's custom Moroccan Darija toxicity detection model against major LLM-based moderation APIs: OpenAI (omni-moderation-latest), Mistral (mistral-moderation-latest), and Anthropic Claude (claude-3-haiku-20240307). We focus on culturally grounded toxic content, including implicit insults, sarcasm, and culturally specific aggression often overlooked by general-purpose systems. Using a balanced test set derived from the OMCD_Typica.ai_Mix dataset, we report precision, recall, F1-score, and accuracy, offering insights into challenges and opportunities for moderation in underrepresented languages. Our results highlight Typica.ai's superior performance, underlining the importance of culturally adapted models for reliable content moderation.
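The four metrics named in the abstract can be illustrated with a minimal sketch. It assumes binary labels with 1 meaning toxic and treats the toxic class as the positive class; the abstract does not say whether the paper reports per-class or averaged scores, so that choice is an assumption. The function name `binary_metrics` and the toy labels are hypothetical and are not taken from the paper or its repository.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class BinaryMetrics:
    precision: float
    recall: float
    f1: float
    accuracy: float


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> BinaryMetrics:
    """Score binary predictions, treating label 1 (toxic) as the positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one example is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # toxic, flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # clean, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # toxic, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # clean, passed

    # Fall back to 0.0 when a denominator is zero (no positives predicted or present).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    accuracy = (tp + tn) / len(pairs)
    return BinaryMetrics(precision, recall, f1, accuracy)


# Hypothetical toy data: 4 toxic and 4 non-toxic examples (a balanced set).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
print(binary_metrics(y_true, y_pred))
```

On a balanced test set, accuracy is not inflated by a majority class, so it can be read alongside F1 without the usual class-imbalance caveat.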

Code & Models

Repositories

assoudi-typica-ai/darija-toxicity-benchmark (Official)

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Topic Modeling · Natural Language Processing Techniques

Methods: Sparse Evolutionary Training · Focus