How Effectively Can BERT Models Interpret Context and Detect Bengali Communal Violent Text?

Abdullah Khondoker; Enam Ahmed Taufik; Md. Iftekhar Islam Tashik; S M Ishtiak Mahmud; Farig Sadeque

arXiv:2506.19831·cs.CL·June 25, 2025

TL;DR

This paper develops and evaluates a BanglaBERT-based model for detecting Bengali social media texts inciting communal violence, highlighting challenges in context understanding and interpretability.

Contribution

Introduces a fine-tuned BanglaBERT model and ensemble approach for communal violence detection, with analysis of interpretability and model limitations.

Findings

01 Ensemble model improved macro F1 score to 0.63

02 Model struggled with context understanding and closely related terms

03 Interpretability analysis revealed specific model limitations

Abstract

The spread of cyber hatred has led to communal violence, fueling aggression and conflict between religious, ethnic, and social groups and posing a significant threat to social harmony. Despite its importance, the classification of communal violent text remains underexplored in existing research. This study aims to improve the accuracy of detecting text that incites communal violence, focusing on Bengali text sourced from social media platforms. We introduce a fine-tuned BanglaBERT model tailored for this task, which achieves a macro F1 score of 0.60. To address data imbalance, we expanded our dataset by 1,794 instances, which facilitated the development and evaluation of a fine-tuned ensemble model. The ensemble model achieved an improved macro F1 score of 0.63, thus highlighting its effectiveness…
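The abstract reports results as macro F1, which averages the per-class F1 scores with equal weight and so penalizes a model that ignores a rare class. The sketch below shows how that metric can be computed from scratch. The label names and toy predictions are hypothetical and are not taken from the paper's dataset; the function name `macro_f1` is likewise an illustrative choice.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0  # class never predicted correctly
        f1_scores.append(f1)
    return sum(f1_scores) / len(f1_scores)


# Hypothetical imbalanced toy data: 8 majority-class and 2 minority-class texts.
y_true = ["non_violent"] * 8 + ["violent"] * 2
# A classifier that always predicts the majority class.
y_pred = ["non_violent"] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"accuracy = {accuracy:.2f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.2f}")
```

In this toy case the always-majority classifier is right on 8 of 10 texts, yet its macro F1 is far lower because the minority class contributes an F1 of zero. This is one reason macro F1 is a common choice when classes are imbalanced, as the abstract says this dataset is.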

Taxonomy

Topics: Hate Speech and Cyberbullying Detection