Misogynistic Tweet Detection: Modelling CNN with Small Datasets
Md Abul Bashar, Richi Nayak, Nicolas Suzor, Bridget Weir

TL;DR
This paper explores how to train a CNN model effectively on a small labelled dataset to detect misogynistic tweets, using domain-specific pre-trained word vectors and regularisation techniques to improve accuracy.
Contribution
It introduces a customised CNN architecture with domain-specific pre-trained word vectors for misogynistic tweet detection on small datasets.
Findings
Pre-trained domain-specific word vectors improve CNN performance.
Regularisation enhances model accuracy with limited data.
The proposed CNN outperforms existing models in accuracy.
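The findings above can be illustrated with a minimal sketch of a generic sentence-classification CNN forward pass: frozen pre-trained word vectors, one convolution filter per feature, max-over-time pooling, and dropout as the regulariser. This is not the authors' architecture, which this summary does not describe in detail. The toy vectors in `PRETRAINED`, the filter sizes, the weights, and all function names are hypothetical and chosen only for illustration; in practice the vectors would be loaded from a model pre-trained on domain-specific text.

```python
import math
import random

# Hypothetical toy "pre-trained" vectors (3 dimensions each). Real vectors
# would be loaded from a model trained on a domain-specific corpus.
PRETRAINED = {
    "<unk>": [0.0, 0.0, 0.0],
    "you": [0.1, -0.2, 0.3],
    "are": [0.0, 0.1, -0.1],
    "great": [0.5, 0.4, 0.2],
    "awful": [-0.6, -0.3, 0.1],
}


def embed(tokens):
    """Look up each token's frozen vector, falling back to <unk>."""
    return [PRETRAINED.get(t, PRETRAINED["<unk>"]) for t in tokens]


def conv_max_pool(matrix, filt, bias=0.0):
    """Slide one filter (width x dim) over the sentence, apply ReLU,
    and keep the largest activation (max-over-time pooling)."""
    width = len(filt)
    activations = []
    for start in range(len(matrix) - width + 1):
        total = bias
        for i in range(width):
            total += sum(w * x for w, x in zip(filt[i], matrix[start + i]))
        activations.append(max(0.0, total))
    # A sentence shorter than the filter yields no window; return 0.0.
    return max(activations) if activations else 0.0


def dropout(features, rate, rng, training):
    """Inverted dropout: zero features with probability `rate` during
    training and rescale the survivors; do nothing at inference time."""
    if not training or rate == 0.0:
        return list(features)
    keep = 1.0 - rate
    return [f / keep if rng.random() < keep else 0.0 for f in features]


def predict(tokens, filters, out_weights, out_bias,
            rate=0.5, rng=None, training=False):
    """Return a probability in (0, 1) for the positive class."""
    matrix = embed(tokens)
    pooled = [conv_max_pool(matrix, f) for f in filters]
    pooled = dropout(pooled, rate, rng or random.Random(0), training)
    logit = out_bias + sum(w * p for w, p in zip(out_weights, pooled))
    return 1.0 / (1.0 + math.exp(-logit))


# Usage with randomly initialised (untrained) parameters.
rng = random.Random(42)
filters = [
    [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(2)]  # width 2
    for _ in range(4)                                            # 4 filters
]
out_weights = [rng.uniform(-1, 1) for _ in range(4)]
probability = predict(["you", "are", "awful"], filters, out_weights, 0.0)
```

Keeping the embedding table frozen means only the filters and the output layer would need to be learned, which is one common way to reduce the number of trainable parameters when labelled data is scarce.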
Abstract
Online abuse directed towards women on the social media platform Twitter has attracted considerable attention in recent years. An automated method to effectively identify misogynistic abuse could improve our understanding of the patterns, driving factors, and effectiveness of responses associated with abusive tweets over a sustained time period. However, training a neural network (NN) model with a small set of labelled data to detect misogynistic tweets is difficult. This is partly due to the complex nature of tweets which contain misogynistic content, and the vast number of parameters needed to be learned in a NN model. We have conducted a series of experiments to investigate how to train a NN model to detect misogynistic tweets effectively. In particular, we have customised and regularised a Convolutional Neural Network (CNN) architecture and shown that the word vectors pre-trained on…
Taxonomy
Topics: Hate Speech and Cyberbullying Detection
