Retrieval Augmented Enhanced Dual Co-Attention Framework for Target Aware Multimodal Bengali Hateful Meme Detection
Raihan Tanvir, Md. Golam Rabiul Alam

TL;DR
This paper introduces a retrieval-augmented dual co-attention framework for detecting hate speech in Bengali multimodal memes, addressing low-resource challenges with dataset augmentation, cross-modal learning, and non-parametric inference.
Contribution
The paper proposes xDORA, an Enhanced Dual Co-attention Framework that combines vision encoders (CLIP, DINOv2) and multilingual text encoders (XGLM, XLM-R) with retrieval-based reasoning to improve hateful meme detection in Bengali.
Findings
xDORA achieves macro F1-scores of 0.78 for hate detection and 0.71 for target detection.
RAG-Fused DORA improves these scores to 0.79 and 0.74, respectively, surpassing the baseline.
The FAISS-based k-nearest neighbor classifier demonstrates robustness for rare classes.
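The findings above are reported as macro F1, which averages per-class F1 without weighting by class frequency, so rare classes count as much as common ones. The following is a minimal sketch of that metric in plain Python. The function name `macro_f1` and the toy labels are illustrative assumptions, not taken from the paper.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (illustrative helper)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy, made-up labels: a classifier that ignores the rare class entirely.
y_true = ["hate"] * 2 + ["not"] * 8
y_pred = ["not"] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = macro_f1(y_true, y_pred)
```

In this toy case accuracy is 0.8 while macro F1 is only 4/9 (about 0.44), which is why macro F1 is a common choice when classes are imbalanced.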
Abstract
Hateful content on social media increasingly appears as multimodal memes that combine images and text to convey harmful narratives. In low-resource languages such as Bengali, automated detection remains challenging due to limited annotated data, class imbalance, and pervasive code-mixing. To address these issues, we augment the Bengali Hateful Memes (BHM) dataset with semantically aligned samples from the Multimodal Aggression Dataset in Bengali (MIMOSA), improving both class balance and semantic diversity. We propose the Enhanced Dual Co-attention Framework (xDORA), integrating vision encoders (CLIP, DINOv2) and multilingual text encoders (XGLM, XLM-R) via weighted attention pooling to learn robust cross-modal representations. Building on these embeddings, we develop a FAISS-based k-nearest neighbor classifier for non-parametric inference and introduce RAG-Fused DORA, which…
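The non-parametric inference idea in the abstract can be sketched as a nearest-neighbor lookup over stored embeddings. The example below is a brute-force, pure-Python stand-in for what a FAISS index would do at scale. The cosine metric, the similarity-weighted vote, the value of `k`, and the two-dimensional toy embeddings are all assumptions made for illustration; this summary does not state the paper's actual settings.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def knn_predict(query, bank, k=3):
    """Label a query by a similarity-weighted vote of its k nearest neighbors.

    `bank` is a list of (embedding, label) pairs; a FAISS index would replace
    the exhaustive scan below when the bank is large.
    """
    scored = sorted(
        ((cosine(query, emb), label) for emb, label in bank),
        key=lambda item: item[0],
        reverse=True,
    )[:k]
    votes = defaultdict(float)
    for sim, label in scored:
        votes[label] += sim
    return max(votes, key=votes.get)


# Hypothetical 2-D embeddings standing in for fused image-text features.
bank = [
    ([1.0, 0.0], "hateful"),
    ([0.9, 0.1], "hateful"),
    ([0.0, 1.0], "non-hateful"),
    ([0.1, 0.9], "non-hateful"),
]

prediction = knn_predict([0.8, 0.2], bank, k=3)
```

Because prediction depends only on stored examples, adding a few labeled samples of a rare class changes the output without retraining, which is one plausible reason such a classifier can hold up on rare classes.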
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Sentiment Analysis and Opinion Mining · Misinformation and Its Impacts
