Hateminers : Detecting Hate speech against Women

Punyajoy Saha; Binny Mathew; Pawan Goyal; Animesh Mukherjee

arXiv:1812.06700 · cs.SI · December 18, 2018 · 37 cites

TL;DR

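This paper presents machine learning models for detecting misogyny in tweets. By concatenating sentence embeddings, TF-IDF vectors, and BOW vectors, the models placed first in English Subtask A and fifth in English Subtask B of the EVALITA 2018 Automatic Misogyny Identification (AMI) shared task.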

Contribution

The paper introduces a feature engineering approach (concatenated sentence embeddings, TF-IDF vectors, and BOW vectors) and a machine learning pipeline for detecting hate speech against women in tweets.

Findings

01. Achieved first place in English Subtask A

02. Achieved fifth place in English Subtask B

03. Released a publicly available model for hate speech detection

Abstract

With the online proliferation of hate speech, there is an urgent need for systems that can detect such harmful content. In this paper, we present the machine learning models developed for the Automatic Misogyny Identification (AMI) shared task at EVALITA 2018. We generate three types of features to represent each tweet: sentence embeddings, TF-IDF vectors, and BOW vectors. These features are then concatenated and fed into the machine learning models. Our model placed first in English Subtask A and fifth in English Subtask B. We release our winning model for public use at https://github.com/punyajoy/Hateminers-EVALITA.
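
The feature construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the sentence encoder, the TF-IDF settings, or the classifier, so everything below is an assumption. In particular, `toy_sentence_embedding` is a hypothetical stand-in for a real sentence encoder, the smoothed IDF formula is one common choice, and the example tweets are invented. The sketch only shows how three per-tweet vectors can be concatenated into one feature vector.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization, chosen for brevity.
    return text.lower().split()


def build_vocab(docs):
    # Sorted vocabulary so that vector positions are deterministic.
    return sorted({tok for doc in docs for tok in tokenize(doc)})


def bow_vector(doc, vocab):
    # Bag-of-words: raw count of each vocabulary word in the tweet.
    counts = Counter(tokenize(doc))
    return [float(counts[w]) for w in vocab]


def tfidf_vector(doc, vocab, docs):
    # TF-IDF with a smoothed IDF (an assumed variant, not taken from the paper).
    counts = Counter(tokenize(doc))
    n = len(docs)
    vec = []
    for w in vocab:
        df = sum(1 for d in docs if w in tokenize(d))
        idf = math.log((1 + n) / (1 + df)) + 1.0
        vec.append(counts[w] * idf)
    return vec


def toy_sentence_embedding(doc, dim=8):
    # Hypothetical placeholder for a real sentence encoder: each token is
    # hashed into one of `dim` buckets and the bucket counts are averaged.
    vec = [0.0] * dim
    toks = tokenize(doc)
    for tok in toks:
        vec[sum(ord(c) for c in tok) % dim] += 1.0
    return [v / max(len(toks), 1) for v in vec]


def featurize(doc, vocab, docs):
    # Concatenate the three representations into a single feature vector.
    return (
        toy_sentence_embedding(doc)
        + tfidf_vector(doc, vocab, docs)
        + bow_vector(doc, vocab)
    )


tweets = ["you are great", "you are awful", "great day today"]  # invented examples
vocab = build_vocab(tweets)
features = [featurize(t, vocab, tweets) for t in tweets]
```

Each row of `features` has length `dim + 2 * len(vocab)` and could be passed to any standard classifier; which classifier the paper uses is not stated in the abstract.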

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Freedom of Expression and Defamation · Swearing, Euphemism, Multilingualism