IITR-CIOL@NLU of Devanagari Script Languages 2025: Multilingual Hate Speech Detection and Target Identification in Devanagari-Scripted Languages

Siddhant Gupta, Siddh Singhal, and Azmine Toushik Wasi

arXiv:2412.17947·cs.CL·December 31, 2024

TL;DR

This paper presents a multilingual hate speech detection and target identification system for Devanagari-scripted languages, utilizing a transformer-based model to handle linguistic diversity and transliteration challenges.

Contribution

It introduces the MultilingualRobertaClass model, optimized for multilingual and transliterated text classification in Devanagari-scripted languages.

Findings

1. 88.40% accuracy in hate speech detection
2. 66.11% accuracy in target identification
3. Effective handling of linguistic diversity and transliteration

Abstract

This work addresses two subtasks on hate speech detection and target identification in Devanagari-scripted languages, specifically Hindi, Marathi, Nepali, Bhojpuri, and Sanskrit. Subtask B involves detecting hate speech in online text, while Subtask C requires identifying the specific targets of hate speech, such as individuals, organizations, or communities. We propose the MultilingualRobertaClass model, a deep neural network built on the pretrained multilingual transformer model ia-multilingual-transliterated-roberta and optimized for classification tasks in multilingual and transliterated contexts. The model uses contextualized embeddings to handle linguistic diversity, with a classifier head for binary classification. On the test set, we achieved 88.40% accuracy in Subtask B and 66.11% accuracy in Subtask C.
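The abstract describes a classifier head applied to contextualized embeddings and reports results as accuracy. The sketch below illustrates only those two ideas, in dependency-free Python: a linear layer plus softmax over one pooled sentence embedding, and accuracy as the fraction of correct predictions. It is not the authors' implementation. The function names, the toy weights, the three-dimensional toy embeddings, and the label convention (0 = non-hate, 1 = hate) are all assumptions made for illustration; in the real system the embedding would come from the pretrained encoder and the weights would be learned.

```python
import math
from typing import List, Sequence


def linear(weights: Sequence[Sequence[float]], bias: Sequence[float],
           features: Sequence[float]) -> List[float]:
    """Compute logits = W @ h + b for one pooled sentence embedding h."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, bias)]


def softmax(logits: Sequence[float]) -> List[float]:
    """Turn logits into class probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(weights: Sequence[Sequence[float]], bias: Sequence[float],
            features: Sequence[float]) -> int:
    """Return the index of the most probable class."""
    probs = softmax(linear(weights, bias, features))
    return max(range(len(probs)), key=probs.__getitem__)


def accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of predictions that match the gold labels."""
    if len(predictions) != len(labels) or not labels:
        raise ValueError("predictions and labels must be non-empty and equal length")
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Hypothetical toy values: 2 classes, 3-dimensional "embeddings".
TOY_WEIGHTS = [[1.0, -1.0, 0.0],   # row for class 0 (assumed: non-hate)
               [-1.0, 1.0, 0.5]]   # row for class 1 (assumed: hate)
TOY_BIAS = [0.0, 0.0]
TOY_EMBEDDINGS = [[2.0, 0.0, 0.0], [0.0, 2.0, 1.0], [1.0, 0.5, 0.0]]
TOY_LABELS = [0, 1, 1]

toy_predictions = [predict(TOY_WEIGHTS, TOY_BIAS, h) for h in TOY_EMBEDDINGS]
toy_accuracy = accuracy(toy_predictions, TOY_LABELS)
print(f"predictions={toy_predictions} accuracy={toy_accuracy:.2%}")
```

For Subtask C the same head shape would apply with one weight row per target category (for example individual, organization, community), though the abstract does not state how the head was configured for that subtask.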

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Swearing, Euphemism, Multilingualism