Tackling Fake News in Bengali: Unraveling the Impact of Summarization vs. Augmentation on Pre-trained Language Models

Arman Sakif Chowdhury; G. M. Shahariar; Ahammed Tarik Aziz; Syed Mohibul Alam; Md. Azad Sheikh; Tanveer Ahmed Belal

arXiv:2307.06979 · cs.CL · October 20, 2025 · 1 citation

TL;DR

This paper explores the use of summarization and augmentation techniques with pre-trained language models to improve fake news detection in Bengali, a low-resource language, demonstrating high accuracy through extensive experiments.

Contribution

The paper proposes four approaches that combine summarization and augmentation with five pre-trained language models for Bengali fake news detection, addressing the shortage of fake news articles in this low-resource language.

Findings

01 BanglaBERT with augmentation achieved 96% accuracy

02 Summarized augmented news improved detection accuracy

03 mBERT achieved 86% accuracy on generalization dataset

Abstract

With the rise of social media and online news sources, fake news has become a significant issue globally. However, the detection of fake news in low-resource languages like Bengali has received limited attention in research. In this paper, we propose a methodology consisting of four distinct approaches to classify fake news articles in Bengali using summarization and augmentation techniques with five pre-trained language models. Our approach includes translating English news articles and using augmentation techniques to offset the shortage of fake news articles. We also summarize the news to work around the token length limitation of BERT-based models. Through extensive experimentation and rigorous evaluation, we show the effectiveness of summarization and augmentation for Bengali fake news detection. We evaluated our models using three separate test datasets.…
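
The abstract motivates summarization by the input-length limit of BERT-based models (512 tokens for standard BERT checkpoints). The following is a minimal Python sketch of that routing idea only, not the authors' pipeline: short articles pass through unchanged, and long ones are shortened first. The names `count_tokens`, `lead_summary`, and `prepare_input` are hypothetical. Whitespace word counting stands in for a real subword tokenizer (which would usually produce more tokens), and keeping the leading sentences stands in for whatever summarization model is actually used.

```python
from typing import Callable, List

MAX_TOKENS = 512  # input limit of standard BERT checkpoints


def count_tokens(text: str) -> int:
    """Assumption: whitespace words approximate tokens.
    A real subword tokenizer would typically yield a higher count."""
    return len(text.split())


def lead_summary(text: str, budget: int) -> str:
    """Placeholder summarizer (assumption): keep leading sentences,
    split on the Bengali danda, while they fit within the budget."""
    kept: List[str] = []
    used = 0
    for sentence in text.split("।"):
        sentence = sentence.strip()
        if not sentence:
            continue
        n = count_tokens(sentence)
        if used + n > budget:
            break
        kept.append(sentence)
        used += n
    if not kept:
        # First sentence alone exceeds the budget: fall back to truncation.
        return " ".join(text.split()[:budget])
    return "। ".join(kept) + "।"


def prepare_input(
    text: str,
    max_tokens: int = MAX_TOKENS,
    summarize: Callable[[str, int], str] = lead_summary,
) -> str:
    """Return the text unchanged if it fits, otherwise a summarized version."""
    if count_tokens(text) <= max_tokens:
        return text
    return summarize(text, max_tokens)
```

In practice the `summarize` argument would be replaced by a call to a trained summarization model, and `count_tokens` by the tokenizer of the classifier being fine-tuned.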

Code & Models

Repositories

arman-sakif/bengali-fake-news-detection (official)

Models

armansakif/bengali-fake-news (Hugging Face model · 224 downloads · 1 like)

Taxonomy

Topics: Misinformation and Its Impacts · Spam and Phishing Detection · Topic Modeling

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Weight Decay · Linear Warmup With Linear Decay · Residual Connection · Adam · Dense Connections · Dropout