Empirical Evaluation of Public HateSpeech Datasets
Sadar Jaf, Basel Barakat

TL;DR
This paper empirically evaluates public hate speech datasets, exposing their limitations and offering insights to support the training of more accurate hate speech detection models.
Contribution
It offers a comprehensive analysis of existing datasets, highlighting their weaknesses and guiding future improvements for hate speech classification.
Findings
Current datasets have significant limitations that reduce model accuracy.
Statistical analyses reveal specific dataset weaknesses.
The paper offers recommendations for developing better hate speech datasets.
Abstract
Despite the extensive communication benefits offered by social media platforms, numerous challenges must be addressed to ensure user safety. One of the most significant risks faced by users on these platforms is targeted hate speech. Social media platforms are widely utilised for generating datasets employed in training and evaluating machine learning algorithms for hate speech detection. However, existing public datasets exhibit numerous limitations, hindering the effective training of these algorithms and leading to inaccurate hate speech classification. This study provides a comprehensive empirical evaluation of several public datasets commonly used in automated hate speech classification. Through rigorous analysis, we present compelling evidence highlighting the limitations of current hate speech datasets. Additionally, we conduct a range of statistical analyses to elucidate the…
Taxonomy
Topics: Hate Speech and Cyberbullying Detection
