"Is Hate Lost in Translation?": Evaluation of Multilingual LGBTQIA+ Hate Speech Detection
Fai Leui Chan, Duke Nguyen, Aditya Joshi

TL;DR
This study evaluates how well multilingual large language models detect LGBTQIA+ hate speech across different languages and translation scenarios, revealing language-specific challenges and the effects of fine-tuning and translation on detection accuracy.
Contribution
It provides a comprehensive analysis of hate speech detection in multilingual contexts, highlighting the impact of translation and fine-tuning on model performance across diverse languages.
Findings
English language detection performs best.
Code-switching detection is the most challenging.
Fine-tuning consistently improves detection accuracy.
Abstract
This paper explores the challenges that large language models face in detecting LGBTQIA+ hate speech across multiple languages, including English, Italian, Chinese and (code-switched) English-Tamil, examining the impact of machine translation and whether the nuances of hate speech are preserved across translation. We examine the hate speech detection ability of zero-shot and fine-tuned GPT. Our findings indicate that: (1) English has the highest performance and the code-switching scenario of English-Tamil the lowest, and (2) fine-tuning improves performance consistently across languages whilst translation yields mixed results. Through simple experimentation with original text and machine-translated text for hate speech detection along with a qualitative error analysis, this paper sheds light on the socio-cultural nuances and complexities of languages that may not be captured by automatic…
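The comparison described in the abstract (original versus machine-translated text, per language) can be illustrated with a small scoring sketch. This is a minimal sketch under stated assumptions, not the authors' evaluation code: the abstract does not name the metric, so macro-F1 is an assumption here, and the record fields (`language`, `condition`, `gold`, `pred`), the label names, and the toy records are all hypothetical.

```python
from collections import defaultdict


def macro_f1(gold, pred, labels=("hate", "not_hate")):
    """Unweighted mean of per-class F1 scores (metric choice is an assumption)."""
    scores = []
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def score_by_condition(records):
    """Group records by (language, condition) and compute macro-F1 per group."""
    groups = defaultdict(lambda: ([], []))
    for r in records:
        gold, pred = groups[(r["language"], r["condition"])]
        gold.append(r["gold"])
        pred.append(r["pred"])
    return {key: macro_f1(g, p) for key, (g, p) in groups.items()}


# Toy records invented purely to exercise the code; NOT results from the paper.
toy_records = [
    {"language": "Italian", "condition": "original", "gold": "hate", "pred": "hate"},
    {"language": "Italian", "condition": "original", "gold": "not_hate", "pred": "not_hate"},
    {"language": "Italian", "condition": "translated", "gold": "hate", "pred": "not_hate"},
    {"language": "Italian", "condition": "translated", "gold": "not_hate", "pred": "not_hate"},
]

scores = score_by_condition(toy_records)
for (language, condition), value in sorted(scores.items()):
    print(f"{language:10s} {condition:12s} macro-F1 = {value:.3f}")
```

Keeping the language and the condition together as the grouping key makes it straightforward to compare the same language with and without translation, which is the contrast the abstract reports as yielding mixed results.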
Taxonomy
Topics: Hate Speech and Cyberbullying Detection
Methods: Attention Is All You Need · Cosine Annealing · Linear Layer · Multi-Head Attention · Dense Connections · Residual Connection · Dropout · Layer Normalization · Linear Warmup With Cosine Annealing
