Modeling offensive content detection for TikTok

Kasper Cools; Gideon Mailette de Buy Wenniger; Clara Maathuis

arXiv:2408.16857 · cs.CL · October 21, 2024

Open Access

TL;DR

This paper presents an approach to detecting offensive content on TikTok: it collects a dataset of over 120,000 comments and develops machine learning models that reach an F1 score of 0.863 in identifying offensive comments.

Contribution

The paper introduces a new dataset of TikTok comments and develops machine learning models tailored to offensive content detection on the platform.

Findings

01. Achieved an F1 score of 0.863 in offensive content detection

02. Collected over 120,000 TikTok comments for analysis

03. Demonstrated the effectiveness of machine learning models for offensive content detection on TikTok

Abstract

The advent of social media transformed interpersonal communication and information consumption processes. This digital landscape accommodates user intentions, which has also led to an increase in offensive language and harmful behavior. Concurrently, social media platforms collect vast datasets comprising user-generated content and behavioral information. These datasets are instrumental for platforms deploying machine learning and data-driven strategies, facilitating customer insights and countermeasures against social manipulation mechanisms like disinformation and offensive content. Nevertheless, the availability of such datasets to researchers and practitioners, along with the application of various machine learning techniques to them, remains limited for specific social media platforms and particular events. In particular for TikTok, which offers unique tools for personalized content…

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Misinformation and Its Impacts · Advanced Malware Detection Techniques