Sensitive Content Classification in Social Media: A Holistic Resource and Evaluation

Dimosthenis Antypas; Indira Sen; Carla Perez-Almendros; Jose Camacho-Collados; Francesco Barbieri

arXiv:2411.19832·cs.CL·June 25, 2025

Open Access · 4 Models · 1 Dataset · 1 Video

TL;DR

This paper introduces a comprehensive dataset for detecting six types of sensitive social media content and shows that fine-tuning large language models on this dataset significantly improves detection accuracy over existing models and APIs.

Contribution

The paper presents a new unified dataset for multiple sensitive content categories and demonstrates improved detection performance through fine-tuning large language models.

Findings

01. Fine-tuned LLMs outperform off-the-shelf models by 10-15%.

02. Existing moderation APIs underperform on sensitive content detection.

03. The dataset covers six diverse sensitive categories.

Abstract

The detection of sensitive content in large datasets is crucial for ensuring that shared and analysed data is free from harmful material. However, current moderation tools, such as external APIs, suffer from limitations in customisation, accuracy across diverse sensitive categories, and privacy concerns. Additionally, existing datasets and open-source models focus predominantly on toxic language, leaving gaps in detecting other sensitive categories such as substance abuse or self-harm. In this paper, we put forward a unified dataset tailored for social media content moderation across six sensitive categories: conflictual language, profanity, sexually explicit material, drug-related content, self-harm, and spam. By collecting and annotating data with consistent retrieval strategies and guidelines, we address the shortcomings of previous focalised research. Our analysis demonstrates that…
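The six categories named in the abstract can be illustrated with a small sketch of how labels for such a task might be encoded and scored. This is a minimal sketch under stated assumptions, not the authors' pipeline: it assumes a post may carry several categories at once (multi-label), the short label names in `CATEGORIES` are hypothetical, and macro-averaged F1 is used only because it is a common metric for this kind of task; the abstract does not state which metric the paper reports.

```python
from typing import Dict, Iterable, List

# Hypothetical short names for the six categories listed in the abstract.
CATEGORIES = ["conflictual", "profanity", "sex", "drugs", "selfharm", "spam"]


def encode(labels: Iterable[str]) -> List[int]:
    """Turn a set of category names into a multi-hot vector over CATEGORIES."""
    active = set(labels)
    unknown = active - set(CATEGORIES)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    return [1 if cat in active else 0 for cat in CATEGORIES]


def f1_per_category(gold: List[List[int]], pred: List[List[int]]) -> Dict[str, float]:
    """Compute F1 separately for each category from multi-hot vectors."""
    scores = {}
    for i, cat in enumerate(CATEGORIES):
        tp = sum(1 for g, p in zip(gold, pred) if g[i] == 1 and p[i] == 1)
        fp = sum(1 for g, p in zip(gold, pred) if g[i] == 0 and p[i] == 1)
        fn = sum(1 for g, p in zip(gold, pred) if g[i] == 1 and p[i] == 0)
        denom = 2 * tp + fp + fn
        # Convention chosen here: a category with no gold or predicted
        # positives scores 0.0 rather than being skipped.
        scores[cat] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(gold: List[List[int]], pred: List[List[int]]) -> float:
    """Unweighted mean of the per-category F1 scores."""
    scores = f1_per_category(gold, pred)
    return sum(scores.values()) / len(scores)


if __name__ == "__main__":
    gold = [encode(["profanity"]), encode(["drugs", "spam"]), encode([])]
    pred = [encode(["profanity"]), encode(["drugs"]), encode(["spam"])]
    print(f1_per_category(gold, pred))
    print(macro_f1(gold, pred))
```

Macro averaging weights every category equally, so rare categories such as self-harm count as much as frequent ones; a micro average would instead be dominated by the most common labels.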

Code & Models

Datasets

cardiffnlp/x_sensitive
dataset · 57 downloads

Videos

Sensitive Content Classification in Social Media: A Holistic Resource and Evaluation

Taxonomy

Topics: Sentiment Analysis and Opinion Mining

Methods: LLaMA · Focus