TL;DR
This paper introduces new Bengali sentiment analysis datasets and demonstrates that transfer learning with multi-lingual BERT improves sentiment classification accuracy in Bengali, a low-resource language.
Contribution
The paper presents manually tagged 2-class and 3-class Bengali sentiment analysis datasets and fine-tunes multi-lingual BERT on them via transfer learning to improve classification performance.
Findings
Achieved 71% accuracy for 2-class sentiment classification, up from the previous state-of-the-art accuracy of 68%.
First Bengali 3-class sentiment classifier with 60% accuracy.
Analyzed public comments showing sentiment trends in news articles.
Abstract
Sentiment analysis (SA) in Bengali is challenging because this Indo-Aryan language is highly inflected, with more than 160 different inflected forms for verbs, 36 different forms for nouns, and 24 different forms for pronouns. The lack of standard labeled datasets in the Bengali domain makes the task of SA even harder. In this paper, we present manually tagged 2-class and 3-class SA datasets in Bengali. We also demonstrate that the multi-lingual BERT model with relevant extensions can be trained via the approach of transfer learning over those novel datasets to improve the state-of-the-art performance in sentiment classification tasks. This deep learning model achieves an accuracy of 71% for 2-class sentiment classification compared to the current state-of-the-art accuracy of 68%. We also present the very first Bengali SA classifier for the 3-class manually tagged dataset,…
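The transfer-learning setup described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the checkpoint name `bert-base-multilingual-cased`, the label names, and the helper functions `label_maps`, `build_classifier`, and `accuracy` are all assumptions introduced here, and the "relevant extensions" mentioned in the abstract are not reproduced.

```python
# Sketch only: the checkpoint, label names, and helpers below are assumptions,
# not details taken from the paper.
MODEL_NAME = "bert-base-multilingual-cased"  # a public multilingual BERT checkpoint

# Hypothetical label names for the 2-class and 3-class tasks.
LABELS = {
    2: ["negative", "positive"],
    3: ["negative", "neutral", "positive"],
}


def label_maps(num_classes):
    """Return (id2label, label2id) for the 2-class or 3-class task."""
    names = LABELS[num_classes]
    id2label = dict(enumerate(names))
    label2id = {name: i for i, name in id2label.items()}
    return id2label, label2id


def build_classifier(num_classes):
    """Load pretrained multilingual BERT with a fresh classification head."""
    # Imported lazily so the helpers in this file work without transformers installed.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    id2label, label2id = label_maps(num_classes)
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForSequenceClassification.from_pretrained(
        MODEL_NAME,
        num_labels=num_classes,
        id2label=id2label,
        label2id=label2id,
    )
    return tokenizer, model


def accuracy(predictions, references):
    """Fraction of predicted label ids that match the reference label ids."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty dataset")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)
```

Calling `build_classifier(3)` would download the pretrained weights and attach an untrained 3-way classification head; fine-tuning that model on the labeled Bengali data is the transfer-learning step, and `accuracy` is the metric the reported figures refer to.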
Taxonomy
Methods: Linear Layer · Linear Warmup With Linear Decay · Attention Is All You Need · Layer Normalization · Dropout · Weight Decay · Dense Connections · Adam · Attention Dropout
