Tackling Fake News in Bengali: Unraveling the Impact of Summarization vs. Augmentation on Pre-trained Language Models
Arman Sakif Chowdhury, G. M. Shahariar, Ahammed Tarik Aziz, Syed Mohibul Alam, Md. Azad Sheikh, Tanveer Ahmed Belal

TL;DR
This paper explores the use of summarization and augmentation techniques with pre-trained language models to improve fake news detection in Bengali, a low-resource language, demonstrating high accuracy through extensive experiments.
Contribution
It introduces a novel methodology combining summarization and augmentation for Bengali fake news detection using multiple pre-trained models, addressing low-resource challenges.
Findings
BanglaBERT with augmentation achieved 96% accuracy
Summarized augmented news improved detection accuracy
mBERT achieved 86% accuracy on generalization dataset
Abstract
With the rise of social media and online news sources, fake news has become a significant issue globally. However, the detection of fake news in low resource languages like Bengali has received limited attention in research. In this paper, we propose a methodology consisting of four distinct approaches to classify fake news articles in Bengali using summarization and augmentation techniques with five pre-trained language models. Our approach includes translating English news articles and using augmentation techniques to curb the deficit of fake news articles. Our research also focused on summarizing the news to tackle the token length limitation of BERT based models. Through extensive experimentation and rigorous evaluation, we show the effectiveness of summarization and augmentation in the case of Bengali fake news detection. We evaluated our models using three separate test datasets.…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsMisinformation and Its Impacts · Spam and Phishing Detection · Topic Modeling
MethodsMulti-Head Attention · Attention Is All You Need · Linear Layer · Weight Decay · Linear Warmup With Linear Decay · Residual Connection · Adam · Dense Connections · Dropout · Refunds@Expedia|||How do I get a full refund from Expedia?
