M2SA: Multimodal and Multilingual Model for Sentiment Analysis of Tweets
Gaurish Thakkar, Sherzod Hakimov, Marko Tadić

TL;DR
This paper introduces M2SA, a multimodal and multilingual sentiment analysis model for tweets, addressing the gap in non-English data and demonstrating the effectiveness of multimodal approaches with sentiment-tuned language models.
Contribution
It transforms an English Twitter sentiment dataset into a multimodal, multilingual dataset and provides baseline experiments showing the advantages of multimodal models.
Findings
Multimodal models outperform unimodal ones in sentiment analysis.
Sentiment-tuned large language models excel as text encoders.
The dataset enables new multilingual sentiment research.
Abstract
In recent years, multimodal natural language processing, aimed at learning from diverse data types, has garnered significant attention. However, there needs to be more clarity when it comes to analysing multimodal tasks in multi-lingual contexts. While prior studies on sentiment analysis of tweets have predominantly focused on the English language, this paper addresses this gap by transforming an existing textual Twitter sentiment dataset into a multimodal format through a straightforward curation process. Our work opens up new avenues for sentiment-related research within the research community. Additionally, we conduct baseline experiments utilising this augmented dataset and report the findings. Notably, our evaluations reveal that when comparing unimodal and multimodal configurations, using a sentiment-tuned large language model as a text encoder performs exceptionally well.
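To make the unimodal versus multimodal comparison concrete, the following is a minimal sketch of one common fusion scheme: concatenate a text embedding with an image embedding and score the result with a linear softmax classifier. It is not the authors' architecture. The three-way label set, the function names, and the toy vectors and weights are all assumptions for illustration; in practice the text vector would come from a text encoder (for example a sentiment-tuned language model) and the image vector from a vision encoder.

```python
import math
from typing import List, Sequence, Tuple

# Assumed label set for illustration; the dataset's actual labels may differ.
LABELS = ["negative", "neutral", "positive"]


def fuse(text_vec: Sequence[float], image_vec: Sequence[float]) -> List[float]:
    """Fuse the two modalities by simple concatenation."""
    return list(text_vec) + list(image_vec)


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(
    text_vec: Sequence[float],
    image_vec: Sequence[float],
    weights: Sequence[Sequence[float]],
    bias: Sequence[float],
) -> Tuple[str, List[float]]:
    """Score the fused vector with one linear layer and return (label, probabilities)."""
    fused = fuse(text_vec, image_vec)
    scores = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, bias)
    ]
    probs = softmax(scores)
    return LABELS[probs.index(max(probs))], probs


# Toy inputs standing in for encoder outputs (hypothetical values).
text_vec = [0.2, 0.9]   # stand-in for a text-encoder embedding
image_vec = [0.7]       # stand-in for an image-encoder embedding

# One weight row per label; hand-picked here, learned in a real model.
weights = [
    [-1.0, -1.0, -1.0],  # negative
    [0.0, 0.0, 0.0],     # neutral
    [1.0, 1.0, 1.0],     # positive
]
bias = [0.0, 0.0, 0.0]

label, probs = classify(text_vec, image_vec, weights, bias)
print(label, [round(p, 3) for p in probs])
```

A unimodal baseline in this sketch would pass only the text vector (with correspondingly narrower weight rows), which is the kind of configuration the multimodal setup is compared against.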
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Advanced Text Analysis Techniques
