BanglaMM-Disaster: A Multimodal Transformer-Based Deep Learning Framework for Multiclass Disaster Classification in Bangla
Ariful Islam, Md Rifat Hossen, Md. Mahmudul Arif, Abdullah Al Noman, Md Arifur Rahman

TL;DR
This paper introduces BanglaMM-Disaster, a multimodal deep learning framework that combines text and images from social media to classify Bangla posts into nine disaster-related categories, achieving 83.76% accuracy and reducing misclassification.
Contribution
The paper presents a novel multimodal transformer-based framework and a new dataset of 5,037 annotated Bangla social media posts for disaster classification, intended to support real-time disaster monitoring in low-resource settings.
Findings
The best model, which uses early fusion, achieved 83.76% accuracy.
It outperformed the best text-only and image-only baselines by 3.84% and 16.91%, respectively.
Reduced misclassification, especially for ambiguous cases.
Abstract
Natural disasters remain a major challenge for Bangladesh, so real-time monitoring and quick response systems are essential. In this study, we present BanglaMM-Disaster, an end-to-end deep learning-based multimodal framework for disaster classification in Bangla, using both textual and visual data from social media. We constructed a new dataset of 5,037 Bangla social media posts, each consisting of a caption and a corresponding image, annotated into one of nine disaster-related categories. The proposed model integrates transformer-based text encoders, including BanglaBERT, mBERT, and XLM-RoBERTa, with CNN backbones such as ResNet50, DenseNet169, and MobileNetV2, to process the two modalities. Using early fusion, the best model achieves 83.76% accuracy. This surpasses the best text-only baseline by 3.84% and the image-only baseline by 16.91%. Our analysis also shows reduced…
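The early-fusion step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature sizes (768 for a BERT-base-style text encoder, 2048 for pooled ResNet50 features), the single linear classification layer, and the random stand-in features and weights are all assumptions made for illustration. In a real pipeline the two feature vectors would come from the text encoder and the CNN backbone, and the weights would be learned.

```python
import math
import random

NUM_CLASSES = 9    # nine disaster-related categories, per the abstract
TEXT_DIM = 768     # assumption: pooled output size of a BERT-base-style encoder
IMAGE_DIM = 2048   # assumption: globally pooled ResNet50 feature size


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_fusion_classify(text_feat, image_feat, weights, bias):
    """Concatenate both modality features, then apply one linear layer.

    Early (feature-level) fusion means the classifier sees a single joint
    vector rather than combining separate per-modality predictions.
    """
    fused = list(text_feat) + list(image_feat)
    if any(len(row) != len(fused) for row in weights):
        raise ValueError("weight rows must match the fused feature length")
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(logits)


# Random stand-ins for encoder outputs and classifier parameters (hypothetical).
rng = random.Random(0)
text_feat = [rng.gauss(0, 1) for _ in range(TEXT_DIM)]
image_feat = [rng.gauss(0, 1) for _ in range(IMAGE_DIM)]
weights = [
    [rng.gauss(0, 0.01) for _ in range(TEXT_DIM + IMAGE_DIM)]
    for _ in range(NUM_CLASSES)
]
bias = [0.0] * NUM_CLASSES

probs = early_fusion_classify(text_feat, image_feat, weights, bias)
predicted_class = max(range(NUM_CLASSES), key=probs.__getitem__)
```

Concatenating before classification lets a single layer weigh caption and image evidence jointly, which is one plausible reason a fused model could help on posts where either modality alone is ambiguous.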
Taxonomy
Topics: Public Relations and Crisis Communication · Disaster Management and Resilience · Multimodal Machine Learning Applications
