Synthetic vs. Gold: The Role of LLM Generated Labels and Data in Cyberbullying Detection
Arefeh Kazemi, Sri Balaaji Natarajan Kalaivendan, Joachim Wagner, Hamza Qadeer, Kanishk Verma, Brian Davis

TL;DR
This paper explores using Large Language Models (LLMs) to generate synthetic data and labels for cyberbullying detection. Classifiers trained on them perform close to models trained on authentic data, offering a scalable and ethical alternative.
Contribution
It introduces a novel approach that leverages LLMs for synthetic data generation and labeling in cyberbullying detection, reducing reliance on costly and ethically challenging human annotation.
Findings
Synthetic data enables BERT classifiers to reach 75.8% accuracy.
LLM-labeled authentic data achieves 79.1% accuracy.
Performance is close to models trained on fully authentic datasets.
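The accuracy figures above are, by definition, the share of evaluated messages whose predicted label matches the reference label. A minimal sketch of that metric follows, assuming binary labels and evaluation against human (gold) labels; the function name and the toy label lists are hypothetical and are not taken from the paper.

```python
def accuracy(predicted, gold):
    """Return the fraction of items whose predicted label equals the gold label."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty evaluation set")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


# Toy, made-up labels (1 = cyberbullying, 0 = not cyberbullying).
# These are illustrative only and are NOT data or results from the paper.
gold_labels = [1, 0, 0, 1, 0, 1, 0, 0]
model_predictions = [1, 0, 1, 1, 0, 0, 0, 0]

score = accuracy(model_predictions, gold_labels)
print(f"accuracy = {score:.1%}")
```

The same function could be applied to LLM-assigned labels in place of classifier predictions to measure how often they agree with human annotation.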
Abstract
Cyberbullying (CB) presents a pressing threat, especially to children, underscoring the urgent need for robust detection systems to ensure online safety. While large-scale datasets on online abuse exist, there remains a significant gap in labeled data that specifically reflects the language and communication styles used by children. The acquisition of such data from vulnerable populations, such as children, is challenging due to ethical, legal and technical barriers. Moreover, the creation of these datasets relies heavily on human annotation, which not only strains resources but also raises significant concerns due to annotators' exposure to harmful content. In this paper, we address these challenges by leveraging Large Language Models (LLMs) to generate synthetic data and labels. Our experiments demonstrate that synthetic data enables BERT-based CB classifiers to achieve performance…
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Bullying, Victimization, and Aggression · Authorship Attribution and Profiling
