Analyzing Emotions in Bangla Social Media Comments Using Machine Learning and LIME
Bidyarthi Paul, SM Musfiqur Rahman, Dipta Biswas, Md. Ziaul Hasan, Md. Zahid Hossain

TL;DR
This paper explores emotion detection in Bangla social media comments using machine learning models and interpretability tools like LIME, addressing challenges in low-resource language sentiment analysis.
Contribution
It combines multiple machine learning models with LIME-based explainability for emotion analysis in Bangla, an understudied language.
Findings
Random Forest achieved high accuracy in emotion classification.
LIME effectively explained model predictions for better interpretability.
Dimensionality reduction with PCA influenced model performance.
Abstract
Research on understanding emotions in written language continues to expand, especially for understudied languages with distinctive regional expressions and cultural features, such as Bangla. This study examines emotion analysis using 22,698 social media comments from the EmoNoBa dataset. For language analysis, we employ three machine learning models, Linear SVM, KNN, and Random Forest, trained on n-gram features from a TF-IDF vectorizer. We also investigate how reducing dimensionality with PCA affects these models. In addition, we use a BiLSTM model, and AdaBoost to boost decision trees. To make our machine learning models easier to understand, we use LIME to explain the predictions of the AdaBoost classifier, which is built on decision trees. With the goal of advancing sentiment analysis in languages with limited resources, our work examines various techniques to find efficient techniques for emotion…
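To make the "n-gram data from a TF-IDF vectorizer" step concrete, here is a minimal standard-library sketch of TF-IDF weighting over word unigrams and bigrams. It is an illustration only, not the authors' code: the n-gram range, the smoothed IDF formula, the L2 normalization, the function names, and the romanized toy comments are all assumptions, and none of the text comes from EmoNoBa.

```python
import math
from collections import Counter


def word_ngrams(text, n_min=1, n_max=2):
    """Return all word n-grams of a whitespace-tokenized text (hypothetical helper)."""
    tokens = text.split()
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def tfidf_vectors(docs, n_min=1, n_max=2):
    """Build sparse, L2-normalized TF-IDF vectors (dicts) for a list of texts."""
    doc_counts = [Counter(word_ngrams(d, n_min, n_max)) for d in docs]

    # Document frequency: in how many documents each n-gram occurs.
    df = Counter()
    for counts in doc_counts:
        df.update(counts.keys())

    # Smoothed IDF (an assumed variant): log((1 + N) / (1 + df)) + 1.
    n_docs = len(docs)
    idf = {g: math.log((1 + n_docs) / (1 + c)) + 1.0 for g, c in df.items()}

    vectors = []
    for counts in doc_counts:
        weights = {g: tf * idf[g] for g, tf in counts.items()}
        norm = math.sqrt(sum(w * w for w in weights.values())) or 1.0
        vectors.append({g: w / norm for g, w in weights.items()})
    return vectors, idf


# Made-up romanized comments, used only to exercise the functions.
docs = ["ami khub khushi", "ami khub rag", "khub bhalo laglo"]
vectors, idf = tfidf_vectors(docs)
```

An n-gram that occurs in every comment (here "khub") gets the lowest IDF, so rarer and more distinctive n-grams such as "khushi" carry more weight in each vector. Vectors of this kind are what classifiers such as Linear SVM, KNN, or Random Forest would take as input.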
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Mental Health via Writing · Hate Speech and Cyberbullying Detection
Methods: Long Short-Term Memory · Principal Components Analysis · Support Vector Machine · Local Interpretable Model-Agnostic Explanations · Bidirectional LSTM
