Persian Emotion Detection using ParsBERT and Imbalanced Data Handling Approaches
Amirhossein Abaskohi, Nazanin Sabri, Behnam Bahrak

TL;DR
This paper improves Persian emotion detection by applying data augmentation, re-sampling, class-weights, and feature selection with Transformer models, achieving new state-of-the-art results on EmoPars and ArmanEmo datasets.
Contribution
It introduces a novel data selection policy and comprehensive imbalance handling techniques for Persian emotion recognition using PLMs.
Findings
Achieved macro F1-score of 0.81 on ArmanEmo
Achieved macro F1-score of 0.76 on EmoPars
Set new state-of-the-art results on both datasets
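The scores above are macro-averaged F1: F1 is computed per emotion class and then averaged without weighting, so rare classes count as much as frequent ones. The following is a minimal sketch of that metric in plain Python, not the authors' evaluation code. The function name `macro_f1` and the example labels are hypothetical.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (illustrative helper)."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears.
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Hypothetical toy labels: the majority class "joy" is over-predicted.
y_true = ["anger", "anger", "joy", "joy", "joy", "fear"]
y_pred = ["anger", "joy", "joy", "joy", "joy", "joy"]
print(round(macro_f1(y_true, y_pred), 4))
```

Because every class contributes equally, a model that ignores a minority class (here "fear") is penalized heavily, which is why macro F1 suits imbalanced datasets.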
Abstract
Emotion recognition is a machine learning application that can be performed on text, speech, or image data gathered from social media. Detecting emotion is useful in several fields, including opinion mining. With the spread of social media, platforms like Twitter have become data sources, and the informal language used on these platforms makes emotion detection difficult. EmoPars and ArmanEmo are two new human-labeled emotion datasets for the Persian language. These datasets, especially EmoPars, suffer from class imbalance: the number of samples differs substantially between classes. In this paper, we evaluate EmoPars and compare it with ArmanEmo. Throughout this analysis, we use data augmentation techniques, data re-sampling, and class-weights with Transformer-based Pretrained Language Models (PLMs) to handle the imbalance problem of these datasets.…
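One of the imbalance-handling approaches named in the abstract is class weighting. The abstract does not state which weighting scheme is used, so the sketch below assumes the common inverse-frequency ("balanced") heuristic, w_c = N / (K · n_c), where N is the number of samples, K the number of classes, and n_c the count of class c. The function name and the label counts are hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: w_c = N / (K * n_c) (assumed scheme)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * n) for c, n in counts.items()}


# Hypothetical imbalanced label distribution.
labels = ["joy"] * 6 + ["anger"] * 3 + ["fear"] * 1
weights = balanced_class_weights(labels)
for name, weight in sorted(weights.items()):
    print(name, round(weight, 3))
```

Weights like these are typically passed to a weighted loss function so that errors on rare classes contribute more to the training signal than errors on frequent ones.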
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Text and Document Classification Technologies · Web Data Mining and Analysis
Methods: Feature Selection
