Evaluation of Hate Speech Detection Using Large Language Models and Geographical Contextualization
Anwar Hossain Zahid, Monoshi Kumar Roy, Swarna Das

TL;DR
This study evaluates large language models' ability to detect hate speech across multiple languages and geographic contexts, highlighting their strengths and weaknesses in accuracy, contextual understanding, and robustness.
Contribution
It introduces a new evaluation framework for hate speech detection considering binary classification, geographic context, and adversarial robustness, and assesses three state-of-the-art LLMs on diverse datasets.
Findings
Codellama achieved the highest binary classification recall, at 70.6%.
DeepSeekCoder showed the best geographic sensitivity.
Llama2 misclassified 62.5% of adversarial samples.
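The two percentages above are a recall and a misclassification rate. As a minimal sketch of how such figures are computed, the snippet below uses made-up toy labels (1 = hate speech, 0 = not hate speech). The function names and the data are illustrative assumptions, not the authors' code or dataset; the adversarial toy set is sized only so that 5 of 8 predictions are wrong, which is what a 62.5% rate looks like at that scale.

```python
from typing import Sequence


def recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of true hate-speech items (label 1) that the model also flagged."""
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p != 1)
    if tp + fn == 0:
        raise ValueError("recall is undefined without positive examples")
    return tp / (tp + fn)


def misclassification_rate(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of items whose predicted label differs from the true label."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return wrong / len(y_true)


# Toy binary-classification labels (hypothetical, not from the paper).
toy_true = [1, 1, 1, 0, 0]
toy_pred = [1, 0, 1, 0, 1]
toy_recall = recall(toy_true, toy_pred)  # 2 of 3 positives found

# Toy adversarial set (hypothetical): every item is hate speech,
# and the model misses 5 of the 8.
adv_true = [1, 1, 1, 1, 1, 1, 1, 1]
adv_pred = [0, 0, 0, 0, 0, 1, 1, 1]
adv_error = misclassification_rate(adv_true, adv_pred)

print(f"recall = {toy_recall:.3f}, adversarial error = {adv_error:.3f}")
```

Recall ignores false positives, so a model can score well on it while over-flagging benign comments; reading it alongside precision gives a fuller picture.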
Abstract
The proliferation of hate speech on social media is a serious problem with far-reaching effects on society: escalating violence, discrimination, and social fragmentation. Detecting hate speech is intrinsically multifaceted because of cultural, linguistic, and contextual complexities, as well as adversarial manipulation. In this study, we systematically investigate how well LLMs detect hate speech across multilingual datasets and diverse geographic contexts. Our work presents a new evaluation framework with three dimensions: binary classification of hate speech, geography-aware contextual detection, and robustness to adversarially generated text. Using a dataset of 1,000 comments from five diverse regions, we evaluate three state-of-the-art LLMs: Llama2 (13b), Codellama (7b), and DeepSeekCoder (6.7b). Codellama had the best binary classification recall…
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Sentiment Analysis and Opinion Mining · Spam and Phishing Detection
Methods: Sparse Evolutionary Training
