Uncovering Political Hate Speech During Indian Election Campaign: A New Low-Resource Dataset and Baselines
Farhan Ahmad Jafri, Mohammad Aman Siddiqui, Surendrabikram Thapa, Kritesh Rauniyar, Usman Naseem, Imran Razzak

TL;DR
This paper introduces IEHate, a new Hindi dataset of political tweets related to Indian elections, and benchmarks various models for hate speech detection, highlighting the need for advanced techniques in low-resource language contexts.
Contribution
The paper presents a novel annotated dataset for hate speech in Hindi political tweets and provides baseline evaluations using multiple machine learning models.
Findings
Human evaluation scores higher than the benchmarked algorithms at hate speech detection.
Benchmarking shows that model performance can still be improved.
The dataset supports research on hate speech detection in low-resource languages.
Abstract
The detection of hate speech in political discourse is a critical issue, and this becomes even more challenging in low-resource languages. To address this issue, we introduce a new dataset named IEHate, which contains 11,457 manually annotated Hindi tweets related to the Indian Assembly Election Campaign from November 1, 2021, to March 9, 2022. We performed a detailed analysis of the dataset, focusing on the prevalence of hate speech in political communication and the different forms of hateful language used. Additionally, we benchmark the dataset using a range of machine learning, deep learning, and transformer-based algorithms. Our experiments reveal that the performance of these models can be further improved, highlighting the need for more advanced techniques for hate speech detection in low-resource languages. In particular, the relatively higher score of human evaluation over…
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Internet Traffic Analysis and Secure E-voting
