Modeling offensive content detection for TikTok
Kasper Cools, Gideon Mailette de Buy Wenniger, Clara Maathuis

TL;DR
This paper collects a dataset of over 120,000 TikTok comments and trains machine learning models to detect offensive content, reaching an F1 score of 0.863.
Contribution
It introduces a new dataset of TikTok comments and develops machine learning models specifically tailored for offensive content detection on TikTok.
Findings
F1 score of 0.863 achieved in offensive content detection
Collected over 120,000 TikTok comments for analysis
Demonstrated effectiveness of machine learning models in this context
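For readers unfamiliar with the metric reported above, the F1 score is the harmonic mean of precision and recall. The sketch below shows how it is computed from confusion-matrix counts. The function name and the counts are hypothetical illustrations, not values or code from the paper.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return the F1 score (harmonic mean of precision and recall).

    tp: offensive comments correctly flagged
    fp: non-offensive comments wrongly flagged
    fn: offensive comments that were missed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, chosen only to illustrate the arithmetic.
example = f1_score(tp=80, fp=10, fn=20)
print(f"F1 = {example:.3f}")
```

Because F1 balances false positives against false negatives, it tends to be more informative than plain accuracy when offensive comments are a minority of the data.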
Abstract
The advent of social media transformed interpersonal communication and information consumption processes. This digital landscape accommodates user intentions, also resulting in an increase of offensive language and harmful behavior. Concurrently, social media platforms collect vast datasets comprising user-generated content and behavioral information. These datasets are instrumental for platforms deploying machine learning and data-driven strategies, facilitating customer insights and countermeasures against social manipulation mechanisms like disinformation and offensive content. Nevertheless, the availability of such datasets, along with the application of various machine learning techniques, to researchers and practitioners, for specific social media platforms regarding particular events, is limited. In particular for TikTok, which offers unique tools for personalized content…
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Misinformation and Its Impacts · Advanced Malware Detection Techniques
