SLM-Mod: Small Language Models Surpass LLMs at Content Moderation

Xianyang Zhan; Agam Goyal; Yilun Chen; Eshwar Chandrasekharan; Koustuv Saha

arXiv:2410.13155·cs.CL·February 11, 2025·2 cites

TL;DR

This paper demonstrates that small open-source language models, when fine-tuned, can outperform larger models in community-specific content moderation tasks, offering a cost-effective and adaptable alternative.

Contribution

The study shows that small language models (<15B parameters) can surpass larger models in content moderation accuracy and recall, especially in community-specific contexts.

Findings

01 Fine-tuned SLMs outperform zero-shot LLMs in accuracy and recall

02 Few-shot in-context learning gives LLMs only marginal performance gains

03 Cross-community moderation is promising for new platforms

Abstract

Large language models (LLMs) have shown promise in many natural language understanding tasks, including content moderation. However, these models can be expensive to query in real-time and do not allow for a community-specific approach to content moderation. To address these challenges, we explore the use of open-source small language models (SLMs) for community-specific content moderation tasks. We fine-tune and evaluate SLMs (less than 15B parameters) by comparing their performance against much larger open- and closed-sourced models in both a zero-shot and few-shot setting. Using 150K comments from 15 popular Reddit communities, we find that SLMs outperform zero-shot LLMs at content moderation -- 11.5% higher accuracy and 25.7% higher recall on average across all communities. Moreover, few-shot in-context learning leads to only a marginal increase in the performance of LLMs, still…
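The headline numbers above are accuracy and recall averaged across communities. As a minimal sketch of how such a per-community average could be computed, assuming binary labels where 1 means a comment was removed by moderators, the following uses only the Python standard library. The function names, the record layout, and the community names are hypothetical and are not taken from the paper; the authors' exact evaluation protocol may differ.

```python
from collections import defaultdict


def accuracy_and_recall(labels, preds):
    """Return (accuracy, recall) for binary labels.

    Assumption: 1 = comment removed by moderators, 0 = comment kept.
    """
    pairs = list(zip(labels, preds))
    correct = sum(1 for y, p in pairs if y == p)
    true_pos = sum(1 for y, p in pairs if y == 1 and p == 1)
    positives = sum(1 for y in labels if y == 1)
    accuracy = correct / len(pairs)
    # Recall is undefined without positive examples; report 0.0 in that case.
    recall = true_pos / positives if positives else 0.0
    return accuracy, recall


def macro_average(records):
    """Average per-community metrics so each community counts equally.

    records: iterable of (community, label, prediction) tuples.
    Returns (per_community_metrics, mean_accuracy, mean_recall).
    """
    by_community = defaultdict(lambda: ([], []))
    for community, label, pred in records:
        by_community[community][0].append(label)
        by_community[community][1].append(pred)
    per_community = {
        name: accuracy_and_recall(ys, ps) for name, (ys, ps) in by_community.items()
    }
    n = len(per_community)
    mean_accuracy = sum(acc for acc, _ in per_community.values()) / n
    mean_recall = sum(rec for _, rec in per_community.values()) / n
    return per_community, mean_accuracy, mean_recall


if __name__ == "__main__":
    # Toy data with hypothetical community names, for illustration only.
    toy = [
        ("r/example_a", 1, 1), ("r/example_a", 1, 0),
        ("r/example_a", 0, 0), ("r/example_a", 0, 0),
        ("r/example_b", 1, 1), ("r/example_b", 0, 1),
    ]
    per_community, mean_accuracy, mean_recall = macro_average(toy)
    print(per_community, mean_accuracy, mean_recall)
```

Averaging per community, rather than pooling all comments, keeps a large community from dominating the result; whether the paper weights communities this way is an assumption of this sketch.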

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Hate Speech and Cyberbullying Detection