University of Indonesia at SemEval-2025 Task 11: Evaluating State-of-the-Art Encoders for Multi-Label Emotion Detection
Ikhlasul Akmal Hanif, Eryawan Presma Yulianrifat, Jaycent Gunawan Ongris, Eduardus Tjitrahardja, Muhammad Falensi Azmi, Rahmat Bryan Naufal, Alfan Farizki Wicaksono

TL;DR
This paper evaluates various encoder models for multi-label emotion detection across 28 languages, finding that prompt-based encoders with classifier-only training outperform fully fine-tuned models, with ensemble methods achieving the best results.
Contribution
The paper compares full fine-tuning against classifier-only training for multilingual, multi-label emotion detection using state-of-the-art encoders.
Findings
Classifiers trained on top of prompt-based encoders such as mE5 and BGE outperform fully fine-tuned XLMR and mBERT.
An ensemble of BGE models with CatBoost classifiers achieves an average F1-macro score of 56.58 across all languages.
Classifier-only training is more effective than full fine-tuning in this task.
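For readers unfamiliar with the metric, the following is a minimal sketch of how a macro-averaged F1 score can be computed for multi-label predictions and then averaged over languages. The function names and the toy data are illustrative assumptions, and the exact averaging used by the task organizers is not specified in this summary.

```python
from typing import Dict, List, Sequence


def f1_for_label(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Binary F1 for one emotion label; returns 0.0 when the label never occurs."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true: List[List[int]], y_pred: List[List[int]]) -> float:
    """Unweighted mean of per-label F1 scores (rows are texts, columns are labels)."""
    n_labels = len(y_true[0])
    scores = [
        f1_for_label([row[j] for row in y_true], [row[j] for row in y_pred])
        for j in range(n_labels)
    ]
    return sum(scores) / n_labels


def average_over_languages(per_language: Dict[str, float]) -> float:
    """Assumed aggregation: plain mean of the per-language macro F1 scores."""
    return sum(per_language.values()) / len(per_language)


# Toy example with two hypothetical labels and three texts.
gold = [[1, 0], [0, 1], [1, 1]]
pred = [[1, 0], [0, 0], [1, 1]]
score = macro_f1(gold, pred)
```

Because every label contributes equally to the mean, rare emotions weigh as much as frequent ones, which is why macro averaging is a common choice for imbalanced multi-label data.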
Abstract
This paper presents our approach for SemEval-2025 Task 11 Track A, focusing on multi-label emotion classification across 28 languages. We explore two main strategies: fully fine-tuning transformer models and classifier-only training, evaluating different settings such as fine-tuning strategies, model architectures, loss functions, encoders, and classifiers. Our findings suggest that training a classifier on top of prompt-based encoders such as mE5 and BGE yields significantly better results than fully fine-tuning XLMR and mBERT. Our best-performing model on the final leaderboard is an ensemble combining multiple BGE models, where CatBoost serves as the classifier, with different configurations. This ensemble achieves an average F1-macro score of 56.58 across all languages.
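The classifier-only strategy can be illustrated with a small sketch: the encoder is treated as frozen, its embeddings are fixed feature vectors, and only a lightweight classifier per emotion label is trained on top. This is not the authors' implementation. The paper uses CatBoost on BGE and mE5 embeddings, whereas the sketch below swaps in a plain logistic regression per label, and the two-dimensional "embeddings" are hypothetical toy values chosen so the example runs without external libraries.

```python
import math
from typing import List, Tuple

Head = Tuple[List[float], float]  # (weights, bias) for one label


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_label_head(X: List[List[float]], y: List[int],
                     epochs: int = 1000, lr: float = 0.5) -> Head:
    """Fit one logistic-regression head with full-batch gradient descent.

    X holds frozen embeddings; they are never updated, only w and b are.
    """
    dim, n = len(X[0]), len(X)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, t in zip(X, y):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - t
            for k in range(dim):
                grad_w[k] += err * x[k]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def train_heads(X: List[List[float]], Y: List[List[int]]) -> List[Head]:
    """Binary relevance: one independent head per emotion label."""
    return [train_label_head(X, [row[j] for row in Y]) for j in range(len(Y[0]))]


def predict(heads: List[Head], x: List[float], threshold: float = 0.5) -> List[int]:
    """Return a 0/1 decision per label for one embedding."""
    return [
        int(sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= threshold)
        for w, b in heads
    ]


# Hypothetical frozen embeddings (2-d for readability) and two emotion labels.
X_train = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [0.8, 0.9]]
Y_train = [[1, 0], [1, 0], [0, 1], [0, 1], [1, 1]]
heads = train_heads(X_train, Y_train)
```

In practice the feature vectors would come from a sentence encoder applied once to each text, so the expensive model runs only in inference mode and the per-label classifiers can be retrained cheaply under different configurations.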
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Hate Speech and Cyberbullying Detection · Mental Health via Writing
Methods: mBERT
