Benchmarking PyCaret AutoML Against BiLSTM for Fine-Grained Emotion Classification: A Comparative Study on 20-Class Emotion Detection
Arya Muda Siregar, Arielva Simon Siahaan, Haikal Fransisko Simbolon, Luluk Muthoharoh, Ardika Satria, and Martin C.T. Manullang

TL;DR
This study compares classical machine learning models and deep learning approaches, especially BiLSTM, for 20-class emotion classification, highlighting BiLSTM's superior performance in capturing contextual cues.
Contribution
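It benchmarks three classical ML models (Logistic Regression, Multinomial Naive Bayes, SVM, all on TF-IDF features) against three deep learning models (BiLSTM, GRU, and a lightweight Transformer) on the 79,595-sentence 20-Emotion Text Classification Dataset.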
Findings
BiLSTM achieves 89% accuracy and a weighted F1-score of 0.89, slightly ahead of SVM, the best classical model, at 88.11% accuracy.
Traditional ML models like SVM are competitive and efficient.
Deep learning models better capture contextual emotional cues.
Abstract
Fine-grained emotion classification, which identifies specific emotional states such as happiness, anger, sadness, and fear, remains a challenging task in natural language processing. This study benchmarks classical machine learning and deep learning approaches for 20-class emotion classification using the 20-Emotion Text Classification Dataset containing 79,595 English sentences. On the machine learning side, Logistic Regression, Multinomial Naive Bayes, and Support Vector Machine are evaluated using TF-IDF features. On the deep learning side, Bidirectional Long Short-Term Memory, Gated Recurrent Unit, and a lightweight Transformer implemented in PyTorch are compared. The results show that BiLSTM achieves the best overall performance with 89% accuracy and a weighted F1-score of 0.89, slightly outperforming the best machine learning model, SVM, which reaches 88.11% accuracy. The…
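The abstract reports both accuracy and a weighted F1-score. As a rough illustration of how the two metrics relate, the sketch below computes each from scratch in plain Python, assuming the usual definition of weighted F1 as the support-weighted mean of per-class F1 scores. The function names and the toy labels are hypothetical and are not taken from the paper or its dataset.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 scores, weighted by each class's true support."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    support = Counter(y_true)      # true examples per class
    predicted = Counter(y_pred)    # predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    total = len(y_true)
    score = 0.0
    for label, n_true in support.items():
        tp = correct[label]
        precision = tp / predicted[label] if predicted[label] else 0.0
        recall = tp / n_true
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        score += (n_true / total) * f1
    return score


# Toy labels for illustration only (not from the 20-emotion dataset).
y_true = ["joy", "joy", "anger", "fear"]
y_pred = ["joy", "anger", "anger", "fear"]

print(accuracy(y_true, y_pred))
print(weighted_f1(y_true, y_pred))
```

Because each class contributes in proportion to its number of true examples, frequent emotions dominate the weighted score. With 20 classes, a weighted F1 close to accuracy does not by itself show that rare classes are handled well.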
