Fine-tuning multilingual language models in Twitter/X sentiment analysis: a study on Eastern-European V4 languages
Tomáš Filip, Martin Pavlíček, Petr Sosík

TL;DR
This study evaluates fine-tuning multilingual language models for Twitter/X sentiment analysis in Eastern-European V4 languages, demonstrating that small models can outperform larger ones in narrow tasks with limited data.
Contribution
It provides a comparative analysis of various LLMs fine-tuned for sentiment classification in underrepresented languages on Twitter/X data, highlighting effective configurations and the potential of small models.
Findings
Small fine-tuned models can outperform larger general-purpose ones on narrow sentiment tasks
Fine-tuning on small training sets can reach state-of-the-art performance
Optimal settings vary across models and tasks
Abstract
Aspect-based sentiment analysis (ABSA) is a standard NLP task with numerous approaches and benchmarks, where large language models (LLMs) represent the current state of the art. We focus on ABSA subtasks based on Twitter/X data in underrepresented languages. On such narrow tasks, small fine-tuned language models can often outperform universal large ones, providing accessible and inexpensive solutions. We fine-tune several LLMs (BERT, BERTweet, Llama2, Llama3, Mistral) to classify sentiment towards Russia and Ukraine in the context of the ongoing military conflict. The training/testing dataset was obtained through the Twitter/X academic API during 2023 and narrowed to the languages of the V4 countries (Czech Republic, Slovakia, Poland, Hungary). We then measure the models' performance under a variety of settings, including translations, sentiment targets, in-context learning and more, using…
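To make the "sentiment targets" and "in-context learning" settings more concrete, the following is a minimal sketch of how a target-specific few-shot prompt could be assembled and how a model's reply could be mapped back to a label. It is not the authors' pipeline: the label set (`positive`, `neutral`, `negative`), the prompt wording, and the names `Example`, `build_prompt`, and `parse_label` are all assumptions made for illustration, and the tweets are placeholders rather than real data.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed label set; the abstract does not state which labels were used.
LABELS = ("positive", "neutral", "negative")


@dataclass(frozen=True)
class Example:
    """One labelled demonstration (hypothetical structure)."""
    text: str
    target: str  # e.g. "Russia" or "Ukraine"
    label: str


def build_prompt(tweet: str, target: str, shots: List[Example]) -> str:
    """Assemble a few-shot prompt asking for sentiment towards one target."""
    lines = [
        "Classify the sentiment of the tweet towards the given target. "
        f"Answer with one of: {', '.join(LABELS)}.",
        "",
    ]
    for ex in shots:
        if ex.label not in LABELS:
            raise ValueError(f"unknown label: {ex.label!r}")
        lines += [
            f"Tweet: {ex.text}",
            f"Target: {ex.target}",
            f"Sentiment: {ex.label}",
            "",
        ]
    # The query ends with an open "Sentiment:" slot for the model to fill.
    lines += [f"Tweet: {tweet}", f"Target: {target}", "Sentiment:"]
    return "\n".join(lines)


def parse_label(completion: str) -> Optional[str]:
    """Map the first word of a model reply to a label, or None if unrecognised."""
    words = completion.strip().lower().split()
    first = words[0].strip(".,!:;") if words else ""
    return first if first in LABELS else None
```

The same tweet can be queried once per target (for instance once with `target="Russia"` and once with `target="Ukraine"`), which is what distinguishes this setting from plain document-level sentiment classification. Passing an empty `shots` list yields a zero-shot prompt.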
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Digital Communication and Language
