RoIt-XMASA: Multi-Domain Multilingual Sentiment Analysis Dataset for Romanian and Italian

Andrei-Marius Avram; Aureliu Valentin Antonie; Cosmin-Mircea Croitoru; Vlad Andrei Muntean; Dumitru-Clementin Cercel

arXiv:2604.17134·cs.CL·April 21, 2026

TL;DR

This paper introduces RoIt-XMASA, a multilingual sentiment analysis dataset for Romanian and Italian, and proposes a multi-target adversarial training framework that improves cross-lingual and cross-domain sentiment classification.

Contribution

The paper contributes RoIt-XMASA, which extends the Cross-lingual Multi-domain Amazon Sentiment Analysis dataset to Italian and Romanian, and a multi-target adversarial training framework that uses loss reversal with meta-learned coefficients.

Findings

01

XLM-R achieves 66.23% F1-score with the proposed method.

02

The proposed method outperforms the baseline by 4.64% F1-score with XLM-R.

03

Few-shot Llama-3.1-8B achieves 58.43% F1-score, showing a trade-off between the efficiency of prompting and the higher performance of task-specific fine-tuning.

Abstract

We present RoIt-XMASA, a multilingual dataset that extends the Cross-lingual Multi-domain Amazon Sentiment Analysis to Italian and Romanian, comprising 36,000 labeled reviews across three domains (books, movies, and music) and 202,141 unlabeled samples. To address cross-lingual and cross-domain challenges, we propose a multi-target adversarial training framework that employs loss reversal with meta-learned coefficients to dynamically balance sentiment discrimination with domain and language invariance. XLM-R achieves an F1-score of 66.23% with our approach, outperforming the baseline by 4.64%. Few-shot evaluation shows that Llama-3.1-8B achieves 58.43% F1-score, revealing a meaningful trade-off between the efficiency of prompting-based approaches and the higher performance of task-specific fine-tuning.
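The abstract does not spell out the objective, so the following is only a rough sketch of how a loss-reversal objective with positive weighting coefficients might be combined for a single example. The function names, the softplus parameterization, and the toy probability vectors are all assumptions made for illustration; the paper's actual formulation, and the way its coefficients are meta-learned, may differ.

```python
import math


def cross_entropy(probs, label):
    """Negative log-likelihood of the true label under a probability vector."""
    return -math.log(probs[label])


def softplus(x):
    """Map an unconstrained parameter to a strictly positive coefficient."""
    return math.log1p(math.exp(x))


def reversed_objective(sent_loss, dom_loss, lang_loss, raw_dom, raw_lang):
    """Hypothetical encoder objective with reversed adversarial losses.

    The sentiment loss is minimized as usual, while the domain and language
    losses enter with a negative sign, so lowering this objective pushes the
    encoder toward features that the domain and language classifiers cannot
    separate. raw_dom and raw_lang stand in for the meta-learned coefficients;
    here they are plain fixed numbers (assumption).
    """
    lam_dom = softplus(raw_dom)
    lam_lang = softplus(raw_lang)
    return sent_loss - lam_dom * dom_loss - lam_lang * lang_loss


# Toy predictions for one review (made-up values, not from the paper).
sent = cross_entropy([0.7, 0.3], 0)       # binary sentiment head
dom = cross_entropy([0.5, 0.3, 0.2], 1)   # three domains: books, movies, music
lang = cross_entropy([0.6, 0.4], 1)       # two-way language head (assumed)

total = reversed_objective(sent, dom, lang, raw_dom=0.0, raw_lang=0.0)
print(f"sentiment={sent:.4f} domain={dom:.4f} language={lang:.4f} total={total:.4f}")
```

In a real training loop the two raw coefficients would be updated by an outer optimization step rather than fixed, and the domain and language heads would be trained separately to minimize their own losses; neither step is shown here.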
