Old wine in old glasses: Comparing computational and qualitative methods in identifying incivility on Persian Twitter during the #MahsaAmini movement
Hossein Kermani, Fatemeh Oudlajani, Pardis Yarahmadi, Hamideh Mahdi Soltani, Mohammad Makki, Zahra HosseiniKhoo

TL;DR
This study compares human, supervised machine learning, and large language model methods for detecting incivility in Persian tweets from the #MahsaAmini movement, highlighting their relative accuracy and limitations.
Contribution
It provides a comprehensive comparison of qualitative and computational approaches for hate speech detection in Persian, emphasizing the strengths and weaknesses of each method.
Findings
ParsBERT substantially outperforms all seven evaluated ChatGPT models
ChatGPT struggles with both subtle and explicitly uncivil content
Prompt language (English vs. Persian) does not meaningfully affect ChatGPT outputs
Abstract
This paper compares three approaches to detecting incivility in Persian tweets: human qualitative coding, supervised learning with ParsBERT, and large language models (ChatGPT). Using 47,278 tweets from the #MahsaAmini movement in Iran, we evaluate the accuracy and efficiency of each method. ParsBERT substantially outperforms seven evaluated ChatGPT models in identifying hate speech. We also find that ChatGPT struggles not only with subtle cases but also with explicitly uncivil content, and that prompt language (English vs. Persian) does not meaningfully affect its outputs. The study provides a detailed comparison of these approaches and clarifies their strengths and limitations for analyzing hate speech in a low-resource language context.
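As a minimal sketch of how such a comparison can be scored, the Python below measures two classifiers against human-coded labels using accuracy, precision, recall, and F1. The function name `evaluate`, the binary label scheme (1 = uncivil, 0 = civil), and the toy labels are all assumptions made for illustration; they are not the paper's data, coding scheme, or evaluation code.

```python
from typing import Dict, Sequence


def evaluate(gold: Sequence[int], pred: Sequence[int]) -> Dict[str, float]:
    """Score binary predictions (1 = uncivil, 0 = civil) against human codes."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    tn = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 0)

    total = len(gold)
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical toy labels, NOT taken from the paper's dataset.
human_codes = [1, 0, 1, 1, 0, 0, 1, 0]   # human qualitative coding
classifier_a = [1, 0, 1, 0, 0, 0, 1, 0]  # stand-in for a fine-tuned model
classifier_b = [0, 0, 1, 0, 1, 0, 0, 0]  # stand-in for a prompted LLM

scores_a = evaluate(human_codes, classifier_a)
scores_b = evaluate(human_codes, classifier_b)

for name, scores in (("classifier_a", scores_a), ("classifier_b", scores_b)):
    print(name, {k: round(v, 3) for k, v in scores.items()})
```

Reporting precision and recall alongside accuracy matters when uncivil tweets are a minority class, because a model that labels everything civil can still reach high accuracy.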
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Misinformation and Its Impacts · Sentiment Analysis and Opinion Mining
