ChildGuard: A Specialized Dataset for Combatting Child-Targeted Hate Speech
Gautam Siddharth Kashyap, Mohammad Anas Azeez, Rafiq Ali, Zohaib Hasan Siddiqui, Jiechao Gao, and Usman Naseem

TL;DR
ChildGuard is a large, annotated dataset specifically designed to improve detection of hate speech targeting children across social media platforms, addressing limitations of previous datasets focused on adults.
Contribution
The paper introduces ChildGuard, the first extensive dataset for child-targeted hate speech, with detailed age-specific labels and analysis of model performance challenges.
Findings
State-of-the-art models perform poorly on ChildGuard
Dataset includes 351,877 annotated examples from X (formerly Twitter), Reddit, and YouTube
Two subsets, contextual (157K) and lexical (194K), enable discourse-level and word-level analysis
Abstract
Hate speech targeting children on social media is a serious and growing problem, yet current NLP systems struggle to detect it effectively. This gap exists mainly because existing datasets focus on adults, lack age-specific labels, miss nuanced linguistic cues, and are often too small for robust modeling. To address this, we introduce ChildGuard, the first large-scale English dataset dedicated to hate speech aimed at children. It contains 351,877 annotated examples from X (formerly Twitter), Reddit, and YouTube, labeled by three age groups: younger children (under 11), pre-teens (11–12), and teens (13–17). The dataset is split into two subsets for fine-grained analysis: a contextual subset (157K) focusing on discourse-level features, and a lexical subset (194K) emphasizing word-level sentiment and vocabulary. Benchmarking state-of-the-art hate speech models on ChildGuard reveals…
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Bullying, Victimization, and Aggression · Spam and Phishing Detection
