Uncovering Political Hate Speech During Indian Election Campaign: A New Low-Resource Dataset and Baselines

Farhan Ahmad Jafri; Mohammad Aman Siddiqui; Surendrabikram Thapa; Kritesh Rauniyar; Usman Naseem; Imran Razzak

arXiv:2306.14764 · cs.CL · June 29, 2023 · 2 cites

Open Access · 1 Repo

TL;DR

This paper introduces IEHate, a new Hindi dataset of political tweets related to Indian elections, and benchmarks various models for hate speech detection, highlighting the need for advanced techniques in low-resource language contexts.

Contribution

The paper presents a novel annotated dataset for hate speech in Hindi political tweets and provides baseline evaluations using multiple machine learning models.

Findings

01. Human evaluation outperforms algorithms in hate speech detection.

02. Benchmarking reveals room for improvement in model performance.

03. Dataset facilitates research in low-resource language hate speech detection.

Abstract

The detection of hate speech in political discourse is a critical issue, and this becomes even more challenging in low-resource languages. To address this issue, we introduce a new dataset named IEHate, which contains 11,457 manually annotated Hindi tweets related to the Indian Assembly Election Campaign from November 1, 2021, to March 9, 2022. We performed a detailed analysis of the dataset, focusing on the prevalence of hate speech in political communication and the different forms of hateful language used. Additionally, we benchmark the dataset using a range of machine learning, deep learning, and transformer-based algorithms. Our experiments reveal that the performance of these models can be further improved, highlighting the need for more advanced techniques for hate speech detection in low-resource languages. In particular, the relatively higher score of human evaluation over…

Code & Models

Repositories

farhan-jafri/indian-election
Official

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Internet Traffic Analysis and Secure E-voting