BanglaAbuseMeme: A Dataset for Bengali Abusive Meme Classification
Mithun Das, Animesh Mukherjee

TL;DR
This paper introduces BanglaAbuseMeme, a new dataset for Bengali abusive meme classification, and evaluates baseline multimodal models that combine image and text analysis, achieving a macro F1 score of 70.51.
Contribution
It provides the first benchmark dataset for Bengali abusive memes and demonstrates the effectiveness of multimodal models over unimodal ones.
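The summary does not say how the multimodal baselines combine the two modalities. As a purely illustrative sketch, one common approach is to concatenate an image feature vector with a text feature vector and feed the result to a classifier. Everything below (the function name `fuse_and_score`, the feature values, the weights, and the bias) is hypothetical and not taken from the paper.

```python
import math


def fuse_and_score(image_features, text_features, weights, bias):
    """Concatenate both modality vectors and apply a logistic classifier.

    Hypothetical helper for illustration; not the authors' model.
    """
    fused = list(image_features) + list(text_features)
    if len(fused) != len(weights):
        raise ValueError("weights must match the fused feature length")
    logit = sum(w * x for w, x in zip(weights, fused)) + bias
    # The sigmoid maps the logit to a probability of the "abusive" class.
    return 1.0 / (1.0 + math.exp(-logit))


# Made-up feature vectors standing in for image and text encoder outputs.
image_features = [0.2, -0.1]
text_features = [0.7, 0.4, -0.3]
weights = [0.5, -0.25, 1.0, 0.5, 0.0]  # hypothetical learned weights
bias = -0.5

prob_abusive = fuse_and_score(image_features, text_features, weights, bias)
print(f"P(abusive) = {prob_abusive:.3f}")
```

A unimodal baseline in this sketch would simply pass one of the two vectors alone, which is the comparison the contribution statement refers to.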
Findings
Multimodal models outperform unimodal models in classifying abusive memes.
The best model achieves a macro F1 score of 70.51.
Qualitative error analysis of misclassified memes highlights the remaining challenges.
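For readers unfamiliar with the metric, macro F1 is the unweighted mean of the per-class F1 scores, so the minority class counts as much as the majority class. The sketch below computes it from scratch on made-up labels; the label values and the function name `macro_f1` are assumptions for illustration, and the result is on a 0 to 1 scale, whereas the reported 70.51 appears to be on a 0 to 100 scale.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels: 1 = abusive, 0 = non-abusive (illustrative data only).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]

score = macro_f1(y_true, y_pred)
print(f"macro F1 = {score:.4f}")
```

Here the non-abusive class has F1 = 0.8 and the abusive class has F1 = 2/3, so the macro average is their plain mean.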
Abstract
The dramatic increase in the use of social media platforms for information sharing has also fueled a steep growth in online abuse. A simple yet effective way of abusing individuals or communities is by creating memes, which often integrate an image with a short piece of text layered on top of it. Such harmful elements are in rampant use and are a threat to online safety. Hence it is necessary to develop efficient models to detect and flag abusive memes. The problem becomes more challenging in a low-resource setting (e.g., Bengali memes, i.e., images with Bengali text embedded in them) because of the absence of benchmark datasets on which AI models could be trained. In this paper, we bridge this gap by building a Bengali meme dataset. To set up an effective benchmark, we implement several baseline models for classifying abusive memes using this dataset. We observe that multimodal models that…
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Spam and Phishing Detection · Sentiment Analysis and Opinion Mining
