Benchmarking PyCaret AutoML Against IndoBERT Fine-Tuning for Sentiment Analysis on Indonesian IKN Twitter Data
Mutia Alfi Mayzaroh, Dwi Fitria Ningsih, Nindi Destriani, and Martin C.T. Manullang

TL;DR
This paper compares classical machine learning models built with PyCaret AutoML against IndoBERT fine-tuning for binary sentiment analysis of Indonesian tweets about Ibu Kota Nusantara (IKN). IndoBERT performs better on a small dataset of 1,472 labeled samples.
Contribution
It provides a benchmark on Indonesian social media data in which a fine-tuned Transformer model outperforms traditional machine learning baselines for sentiment analysis.
Findings
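IndoBERT (indobenchmark/indobert-base-p1, fine-tuned for five epochs) achieved 89.59% test accuracy and 89.37% F1-score.
Logistic Regression was the best classical model under 10-fold cross-validation, with 77.57% accuracy and 77.17% F1-score.
IndoBERT leads the best classical model by about 12 percentage points in accuracy.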
Abstract
This paper benchmarks a classical machine learning approach based on PyCaret AutoML against a deep learning approach based on IndoBERT fine-tuning for binary sentiment analysis of Indonesian-language Twitter comments related to Ibu Kota Nusantara (IKN). The dataset contains 1,472 manually labeled samples, consisting of 780 negative and 692 positive comments. In the machine learning setting, Logistic Regression, Naive Bayes, and Support Vector Machine were evaluated using 10-fold cross-validation, with Logistic Regression achieving the best performance among the classical models at 77.57% accuracy and 77.17% F1-score. In the deep learning setting, the indobenchmark/indobert-base-p1 model was fine-tuned for five epochs and achieved 89.59% test accuracy and 89.37% F1-score. The results show that IndoBERT substantially outperforms the machine learning baselines, highlighting the…
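To put the reported scores in context, the short sketch below derives the majority-class baseline from the class counts in the abstract and computes the gap between the two reported models. It is an illustration only, not the authors' code: the helper names `majority_baseline` and `gap` are hypothetical, and the only inputs are the figures quoted above. Note that the Logistic Regression score is a 10-fold cross-validation estimate while the IndoBERT score is a test-set result, so the gap is indicative rather than a like-for-like comparison.

```python
# Class counts and scores exactly as reported in the abstract.
NEGATIVE, POSITIVE = 780, 692
TOTAL = NEGATIVE + POSITIVE  # 1,472 labeled comments

# model label: (accuracy %, F1-score %)
REPORTED = {
    "Logistic Regression (10-fold CV)": (77.57, 77.17),
    "IndoBERT (test set)": (89.59, 89.37),
}


def majority_baseline(neg: int, pos: int) -> float:
    """Accuracy (%) of always predicting the more frequent class."""
    return 100.0 * max(neg, pos) / (neg + pos)


def gap(a: float, b: float) -> float:
    """Difference a - b in percentage points, rounded to two decimals."""
    return round(a - b, 2)


baseline = majority_baseline(NEGATIVE, POSITIVE)
print(f"samples: {TOTAL}, majority-class baseline: {baseline:.2f}%")

for name, (acc, f1) in REPORTED.items():
    print(f"{name}: accuracy {acc:.2f}% ({gap(acc, baseline):+.2f} points vs. baseline)")

lr_acc, lr_f1 = REPORTED["Logistic Regression (10-fold CV)"]
bert_acc, bert_f1 = REPORTED["IndoBERT (test set)"]
print(f"IndoBERT minus Logistic Regression, accuracy: {gap(bert_acc, lr_acc):+.2f} points")
print(f"IndoBERT minus Logistic Regression, F1-score: {gap(bert_f1, lr_f1):+.2f} points")
```

Because the two classes are close to balanced, a classifier that always predicts "negative" would reach only about 53% accuracy, so both reported models are well above the trivial baseline.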
