Extreme Speech Classification in the Era of LLMs: Exploring Open-Source and Proprietary Models
Sarthak Mahajan, Nimmi Rangaswamy

TL;DR
This paper explores the use of open-source and proprietary Large Language Models for classifying extreme speech online, demonstrating that fine-tuning improves their effectiveness in capturing nuanced socio-cultural contexts.
Contribution
It provides a comparative analysis of open-source Llama models and closed-source OpenAI models for extreme speech classification, emphasizing the impact of domain-specific fine-tuning.
Findings
Fine-tuning significantly improves LLM performance on extreme speech classification.
GPT models outperform Llama models in zero-shot settings, but this gap closes after fine-tuning.
Domain-specific fine-tuning enhances the models' ability to interpret socio-cultural nuances.
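The zero-shot versus fine-tuned comparison in these findings can be sketched as a small evaluation harness. This is a minimal illustration, not the authors' pipeline. The label set, the prompt wording, and the function names `build_zero_shot_prompt`, `parse_label`, and `macro_f1` are all assumptions introduced here. The summary above does not list the paper's classes, its prompts, or its metric.

```python
# Hypothetical label set, for illustration only; the summary does not list the classes.
LABELS = ["derogatory", "exclusionary", "dangerous"]


def build_zero_shot_prompt(text, labels=LABELS):
    """Format a single-turn classification prompt (the wording is an assumption)."""
    options = ", ".join(labels)
    return (
        f"Classify the following text into exactly one of: {options}.\n"
        f"Text: {text}\n"
        "Answer with the label only."
    )


def parse_label(response, labels=LABELS):
    """Map a free-form model reply to a known label, or None if no label matches."""
    cleaned = response.strip().lower()
    for label in labels:
        if label in cleaned:
            return label
    return None


def macro_f1(gold, predicted, labels=LABELS):
    """Unweighted mean of per-class F1; unparseable predictions (None) count as errors."""
    scores = []
    for label in labels:
        pairs = list(zip(gold, predicted))
        tp = sum(g == label and p == label for g, p in pairs)
        fp = sum(g != label and p == label for g, p in pairs)
        fn = sum(g == label and p != label for g, p in pairs)
        denom = 2 * tp + fp + fn
        # A class with no gold and no predicted examples contributes 0.0 here.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

To compare two models under this sketch, one would send each test text through `build_zero_shot_prompt`, collect the replies from the base model and from the fine-tuned model, parse them with `parse_label`, and compute `macro_f1` against the same gold labels for both. Macro averaging is chosen here because it weights rare classes equally; whether the paper reports this metric is not stated in the summary.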
Abstract
In recent years, widespread internet adoption and the growth in the user base of various social media platforms have led to a proliferation of extreme speech online. While traditional language models have demonstrated proficiency in distinguishing between neutral text and non-neutral text (i.e., extreme speech), categorizing the diverse types of extreme speech presents significant challenges. The task of extreme speech classification is particularly nuanced, as it requires a deep understanding of socio-cultural contexts to accurately interpret the intent of the language used by the speaker. Even human annotators often disagree on the appropriate classification of such content, underscoring the complex and subjective nature of this task. Relying on human moderators also does not scale, creating a need for automated systems for extreme speech classification. The…
