Hateminers : Detecting Hate speech against Women
Punyajoy Saha, Binny Mathew, Pawan Goyal, Animesh Mukherjee

TL;DR
This paper presents machine learning models for detecting misogyny in tweets. The models ranked first in English Subtask A and fifth in English Subtask B of the EVALITA 2018 AMI shared task by concatenating sentence embeddings, TF-IDF vectors, and BOW vectors.
Contribution
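The paper introduces a feature engineering approach that represents each tweet by concatenated sentence embeddings, TF-IDF vectors, and BOW vectors, together with a machine learning pipeline that uses these features to detect hate speech against women in social media content.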
Findings
Achieved first place in English Subtask A
Achieved fifth place in English Subtask B
Released a publicly available model for hate speech detection
Abstract
With the online proliferation of hate speech, there is an urgent need for systems that can detect such harmful content. In this paper, we present the machine learning models developed for the Automatic Misogyny Identification (AMI) shared task at EVALITA 2018. We generate three types of features to represent each tweet: Sentence Embeddings, TF-IDF Vectors, and BOW Vectors. These features are then concatenated and fed into the machine learning models. Our model ranked first in English Subtask A and fifth in English Subtask B. We release our winning model for public use at https://github.com/punyajoy/Hateminers-EVALITA.
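The feature construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the whitespace tokenizer, the smoothed IDF formula, the placeholder tweets, and every function name are assumptions made for illustration. In particular, `toy_sentence_embedding` is only a hash-based stand-in for a pretrained sentence encoder, which the abstract does not name.

```python
import math
import zlib
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (assumed; the abstract gives no preprocessing details)."""
    return text.lower().split()


def build_vocab(tweets):
    """Map each distinct token to a column index, in sorted order."""
    vocab = sorted({tok for t in tweets for tok in tokenize(t)})
    return {tok: i for i, tok in enumerate(vocab)}


def bow_vector(tweet, vocab):
    """Bag-of-words: raw term counts over the vocabulary."""
    counts = Counter(tokenize(tweet))
    return [float(counts.get(tok, 0)) for tok in vocab]


def idf_weights(tweets, vocab):
    """Smoothed inverse document frequency: log((1 + n) / (1 + df)) + 1 (one common variant)."""
    n = len(tweets)
    df = Counter(tok for t in tweets for tok in set(tokenize(t)))
    return [math.log((1 + n) / (1 + df[tok])) + 1.0 for tok in vocab]


def tfidf_vector(tweet, vocab, idf):
    """Term counts scaled by IDF, then L2-normalised."""
    raw = [c * w for c, w in zip(bow_vector(tweet, vocab), idf)]
    norm = math.sqrt(sum(x * x for x in raw)) or 1.0
    return [x / norm for x in raw]


def toy_sentence_embedding(tweet, dim=8):
    """Hypothetical stand-in for a real sentence encoder: mean of hashed token vectors."""
    vec = [0.0] * dim
    tokens = tokenize(tweet)
    for tok in tokens:
        h = zlib.crc32(tok.encode("utf-8"))  # deterministic 32-bit hash
        for j in range(dim):
            # Take 4 bits of the hash per dimension and scale to [0, 1].
            vec[j] += ((h >> (4 * j)) & 0xF) / 15.0
    return [x / len(tokens) for x in vec] if tokens else vec


def featurize(tweets, embed=toy_sentence_embedding):
    """Concatenate [sentence embedding | TF-IDF | BOW] for every tweet."""
    # In a real pipeline the vocabulary and IDF would be fitted on training data only.
    vocab = build_vocab(tweets)
    idf = idf_weights(tweets, vocab)
    return [embed(t) + tfidf_vector(t, vocab, idf) + bow_vector(t, vocab) for t in tweets]


# Neutral placeholder tweets, not taken from the AMI data set.
tweets = ["example tweet one", "another example tweet", "something else entirely"]
features = featurize(tweets)
print(len(features), len(features[0]))
```

Each row of `features` could then be passed to any standard classifier. Swapping in a real encoder only requires passing a different `embed` function that maps a string to a list of floats.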
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Freedom of Expression and Defamation · Swearing, Euphemism, Multilingualism
