Beyond Toxic: Toxicity Detection Datasets are Not Enough for Brand Safety
Elizaveta Korotkova, Isaac Chung

TL;DR
Existing toxicity detection datasets are insufficient for brand safety applications. The paper argues for specialized brand safety datasets and analyzes how weighted sampling strategies affect text classification for content moderation.
Contribution
It makes the case for building brand-safety-specific datasets and evaluates the effect of weighted sampling on toxicity-related text classification.
Findings
Common toxicity datasets are inadequate for brand safety tasks.
Weighted sampling strategies influence classification performance.
Brand safety requires tailored datasets beyond toxicity detection.
Abstract
The rapid growth in user-generated content on social media has resulted in a significant rise in demand for automated content moderation. Various methods and frameworks have been proposed for the tasks of hate speech detection and toxic comment classification. In this work, we combine common datasets to extend these tasks to brand safety. Brand safety aims to protect commercial branding by identifying contexts where advertisements should not appear, and it covers not only toxicity but also other potentially harmful content. As these datasets contain different label sets, we approach the overall problem as a binary classification task. We demonstrate the need for building brand-safety-specific datasets by applying common toxicity detection datasets to a subset of brand safety, and we empirically analyze the effects of weighted sampling strategies in text classification.
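The abstract describes two steps that a small sketch may make more concrete: collapsing datasets with different label sets into one binary task, and sampling training examples with weights. The Python below is only an illustration under stated assumptions. The dataset names, the label sets in `UNSAFE_LABELS`, and the choice of inverse class frequency as the weighting scheme are all hypothetical and are not taken from the paper, which may use different datasets and a different weighting strategy.

```python
import random
from collections import Counter

# Hypothetical mapping: which labels count as "unsafe" in each source dataset.
# These names and label sets are illustrative, not the paper's.
UNSAFE_LABELS = {
    "dataset_a": {"toxic", "insult", "threat"},
    "dataset_b": {"hate", "offensive"},
}


def to_binary(dataset_name, labels):
    """Return 1 (unsafe) if any label is in the dataset's unsafe set, else 0."""
    unsafe = UNSAFE_LABELS[dataset_name]
    return int(any(label in unsafe for label in labels))


def combine(datasets):
    """Flatten {name: [(text, labels), ...]} into [(text, binary_label, name), ...]."""
    return [
        (text, to_binary(name, labels), name)
        for name, rows in datasets.items()
        for text, labels in rows
    ]


def inverse_frequency_weights(examples):
    """Weight each example by 1 / (count of its class), so classes balance out."""
    counts = Counter(label for _, label, _ in examples)
    return [1.0 / counts[label] for _, label, _ in examples]


def weighted_sample(examples, k, seed=0):
    """Draw k examples with replacement, using the inverse-frequency weights."""
    rng = random.Random(seed)
    weights = inverse_frequency_weights(examples)
    return rng.choices(examples, weights=weights, k=k)


# Toy data, invented for this example.
datasets = {
    "dataset_a": [("you are awful", ["insult"]), ("nice day", []), ("great game", [])],
    "dataset_b": [("hello there", ["neutral"]), ("placeholder slur", ["hate"])],
}
combined = combine(datasets)
batch = weighted_sample(combined, k=4)
```

With inverse class frequency, the weights of each class sum to the same total, so the minority class is drawn about as often as the majority class. Other schemes, such as weighting by source dataset rather than by class, fit the same `weights` argument.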
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Software Engineering Research
