Exploiting Vietnamese Social Media Characteristics for Textual Emotion Recognition in Vietnamese
Khang Phuoc-Quy Nguyen, Kiet Van Nguyen

TL;DR
This paper investigates how Vietnamese social media characteristics influence textual emotion recognition, demonstrating that tailored pre-processing techniques significantly improve machine learning performance on a Vietnamese social media emotion dataset.
Contribution
The study introduces specific pre-processing methods based on Vietnamese social media traits and shows their effectiveness in enhancing emotion recognition accuracy.
Findings
With the proposed pre-processing, Multinomial Logistic Regression reached an F1-score of 64.40%, 4.66% higher than the CNN model it is compared against.
Multinomial Logistic Regression outperformed CNN models.
Vietnamese social media features are crucial for emotion detection.
Abstract
Textual emotion recognition has been a promising research topic in recent years. Many researchers aim to build more accurate and robust emotion detection systems. In this paper, we conduct several experiments to show how data pre-processing affects a machine learning method on textual emotion recognition. These experiments are performed on the Vietnamese Social Media Emotion Corpus (UIT-VSMEC) as the benchmark dataset. We explore Vietnamese social media characteristics to propose different pre-processing techniques, together with key-clause extraction with emotional context, to improve model performance on UIT-VSMEC. Our experimental evaluation shows that with appropriate pre-processing techniques based on Vietnamese social media characteristics, Multinomial Logistic Regression (MLR) achieves the best F1-score of 64.40%, a significant improvement of 4.66% over the CNN model built by the…
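The abstract does not list the individual pre-processing steps, so the following is only an illustrative sketch of the kind of normalization that social media text often calls for: lowercasing, mapping emoticons to word-like tokens, and collapsing elongated characters. The `EMOTICONS` table and the `normalize` function are hypothetical names chosen for this example and are not taken from the paper.

```python
import re

# Hypothetical emoticon lexicon (an assumption for illustration only;
# the paper's actual resources are not described in this summary).
EMOTICONS = {
    ":))": "emo_happy",
    "=))": "emo_laugh",
    ":)": "emo_happy",
    ":(": "emo_sad",
}


def normalize(text: str) -> str:
    """Apply a few generic social-media normalization steps to one comment."""
    text = text.lower()
    # Replace emoticons with tokens, longest first so ":))" wins over ":)".
    for emoticon in sorted(EMOTICONS, key=len, reverse=True):
        text = text.replace(emoticon, f" {EMOTICONS[emoticon]} ")
    # Collapse any character repeated three or more times to a single one,
    # e.g. "vuiiiii" -> "vui".
    text = re.sub(r"(.)\1{2,}", r"\1", text)
    # Squeeze whitespace introduced by the replacements above.
    return " ".join(text.split())


print(normalize("Vuiiiii quá :))"))  # vui quá emo_happy
print(normalize("Buồnnnn :("))       # buồn emo_sad
```

The normalized strings could then be fed to any bag-of-words feature extractor and a multinomial logistic regression classifier. Which steps actually help on UIT-VSMEC is an empirical question that the paper's experiments address.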
Taxonomy
Methods: Logistic Regression
