Automatic Detection of Satire in Bangla Documents: A CNN Approach Based on Hybrid Feature Extraction Model

Arnab Sen Sharma; Maruf Ahmed Mridul; Md Saiful Islam

arXiv:1911.11062·cs.IR·December 2, 2019

TL;DR

This paper presents a CNN-based method that uses a hybrid feature extraction model, combining Word2Vec and TF-IDF, to detect satirical Bangla news with over 96% accuracy.

Contribution

It introduces a novel hybrid feature extraction approach for satire detection in Bangla texts, enhancing accuracy over existing methods.

Findings

01. Achieved over 96% accuracy in satire detection.

02. Demonstrated effectiveness of hybrid Word2Vec and TF-IDF features.

03. Validated the approach on Bangla news datasets.

Abstract

The spread of satirical news in online communities is an ongoing trend. Satire is inherently ambiguous, so much so that even humans sometimes find it hard to tell whether a piece is actually satire. As a result, research interest in this field has grown. The purpose of this research is to detect Bangla satirical news spread through online news portals and social media. In this paper, we propose a hybrid technique for extracting features from text documents that combines Word2Vec and TF-IDF. Using the proposed feature extraction technique with a standard CNN architecture, we could detect whether a Bangla text document is satire with an accuracy of more than 96%.
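The abstract does not spell out how the Word2Vec and TF-IDF features are combined, so the following is only a minimal sketch of one plausible reading: each token's embedding is scaled by that token's TF-IDF weight, and the scaled vectors are stacked into a fixed-length matrix that a CNN could consume. Everything here is an assumption made for illustration and should not be read as the authors' method: the function names `tfidf_weights` and `hybrid_features`, the smoothed IDF formula, the zero-padding scheme, the English placeholder tokens, and the random vectors standing in for trained Word2Vec embeddings.

```python
import math
from collections import Counter

import numpy as np


def tfidf_weights(docs):
    """Return one {token: tf-idf weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each token.
    df = Counter(tok for doc in docs for tok in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            # Normalized term frequency times a smoothed IDF (assumed form).
            tok: (count / len(doc)) * (math.log((1 + n_docs) / (1 + df[tok])) + 1.0)
            for tok, count in tf.items()
        })
    return weights


def hybrid_features(docs, embeddings, dim, max_len):
    """Build a (n_docs, max_len, dim) array of TF-IDF-scaled word vectors."""
    weights = tfidf_weights(docs)
    out = np.zeros((len(docs), max_len, dim), dtype=np.float32)
    for i, doc in enumerate(docs):
        # Truncate long documents; short ones keep trailing zero rows as padding.
        for j, tok in enumerate(doc[:max_len]):
            vec = embeddings.get(tok)
            if vec is not None:  # out-of-vocabulary tokens stay as zero rows
                out[i, j] = weights[i][tok] * vec
    return out


# Toy corpus of placeholder tokens (not data from the paper).
docs = [
    ["minister", "announces", "new", "policy"],
    ["minister", "rides", "dragon", "to", "parliament"],
]
rng = np.random.default_rng(0)
dim = 8
vocab = sorted({tok for doc in docs for tok in doc})
# Random vectors stand in for trained Word2Vec embeddings (assumption).
embeddings = {tok: rng.standard_normal(dim).astype(np.float32) for tok in vocab}

features = hybrid_features(docs, embeddings, dim=dim, max_len=6)
print(features.shape)  # (2, 6, 8)
```

In a real pipeline the `embeddings` lookup would come from a Word2Vec model trained on a Bangla corpus, and `features` would be passed to a convolutional classifier with a binary (satire or not) output.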
