A Comparative Analysis of Noise Reduction Methods in Sentiment Analysis on Noisy Bangla Texts
Kazi Toufique Elahi, Tasnuva Binte Rahman, Shakil Shahriar, Samir Sarker, Md. Tanvir Rouf Shawon, G. M. Shahariar

TL;DR
This paper introduces a new dataset and evaluates noise reduction methods for sentiment analysis on noisy Bangla texts, revealing current methods are inadequate and highlighting the need for improved techniques.
Contribution
It presents a manually annotated dataset of noisy Bangla texts, formulates noise identification as a multi-label classification task, and benchmarks baseline noise reduction methods for sentiment analysis.
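The multi-label framing means a single text can carry several noise types at once, so each text maps to a binary indicator vector and not to a single class. The sketch below illustrates that encoding and one common way to score predictions. The label names are hypothetical placeholders (the dataset defines ten noise types, which this summary does not list), and micro-averaged F1 is an assumed metric, not necessarily the one the paper reports.

```python
from typing import List, Set

# Hypothetical label names, for illustration only. The dataset defines ten
# noise types; they are not enumerated in this summary.
NOISE_TYPES: List[str] = ["spelling_error", "spacing_error", "mixed_language"]


def to_indicator(labels: Set[str], classes: List[str] = NOISE_TYPES) -> List[int]:
    """Encode a set of noise labels as a binary indicator vector."""
    unknown = labels - set(classes)
    if unknown:
        raise ValueError(f"unknown noise types: {sorted(unknown)}")
    return [1 if c in labels else 0 for c in classes]


def micro_f1(y_true: List[List[int]], y_pred: List[List[int]]) -> float:
    """Micro-averaged F1 pooled over every (text, label) decision."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    denom = 2 * tp + fp + fn
    # With no positive labels and no positive predictions, report 0.0.
    return 2 * tp / denom if denom else 0.0


# Two texts: the first has two noise types, the second has none.
gold = [to_indicator({"spelling_error", "mixed_language"}), to_indicator(set())]
# An imagined model output: one hit, one miss, one false alarm.
pred = [to_indicator({"spelling_error"}), to_indicator({"spacing_error"})]
score = micro_f1(gold, pred)
```

An empty label set is a valid target here, which lets the same encoding represent texts with no identified noise.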
Findings
Baseline noise reduction methods are ineffective on noisy Bangla texts.
Noise reduction methods do not significantly improve sentiment analysis accuracy.
The study provides a publicly available dataset and implementation for future research.
Abstract
While Bangla is considered a language with limited resources, sentiment analysis has been a subject of extensive research in the literature. Nevertheless, sentiment analysis on noisy Bangla texts specifically has received little attention. In this paper, we introduce a dataset (NC-SentNoB) that we annotated manually to identify ten different types of noise found in a pre-existing sentiment analysis dataset comprising around 15K noisy Bangla texts. First, given a noisy input text, we identify its noise types, treating this as a multi-label classification task. Then, we introduce baseline noise reduction methods to alleviate noise before conducting sentiment analysis. Finally, we compare the performance of fine-tuned sentiment analysis models on both noisy and noise-reduced texts. The experimental findings indicate that the noise…
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Advanced Text Analysis Techniques
