Sentiment-Aware Measure (SAM) for Evaluating Sentiment Transfer by Machine Translation Systems
Hadeel Saadany, Constantin Orasan, Emad Mohamed, Ashraf Tantawy

TL;DR
This paper introduces a sentiment-aware evaluation measure for machine translation that better aligns automatic metrics with human judgments on sentiment accuracy, especially in user-generated content.
Contribution
It proposes a novel sentiment-closeness measure to evaluate how well MT systems preserve sentiment, improving correlation with human assessments.
Findings
The sentiment-aware measure improves the correlation of automatic metrics with human judgment.
Conventional metrics often miss sentiment translation errors.
The proposed measure effectively detects sentiment mistranslations.
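The findings above are stated in terms of how well a metric correlates with human judgment. As a rough illustration only, the sketch below computes a Pearson correlation between per-segment metric scores and human ratings. This summary does not say which correlation statistic the paper uses, so Pearson is an assumption here. The function name `pearson` and all score values are hypothetical placeholders, not data or code from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


# Hypothetical placeholder values, one entry per translated segment.
human_ratings = [1.0, 2.0, 3.0, 4.0, 5.0]      # made-up human sentiment-accuracy ratings
metric_scores = [0.10, 0.35, 0.30, 0.70, 0.90]  # made-up automatic metric scores

r = pearson(metric_scores, human_ratings)
print(f"correlation with human ratings: {r:.3f}")
```

A higher coefficient for one metric than for another on the same segments would suggest that metric tracks the human ratings more closely, which is the kind of comparison the findings describe.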
Abstract
In translating text where sentiment is the main message, human translators give particular attention to sentiment-carrying words. The reason is that an incorrect translation of such words would miss the fundamental aspect of the source text, i.e. the author's sentiment. In the online world, MT systems are extensively used to translate User-Generated Content (UGC) such as reviews, tweets, and social media posts, where the main message is often the author's positive or negative attitude towards the topic of the text. It is important in such scenarios to accurately measure how far an MT system can be a reliable real-life utility in transferring the correct affect message. This paper tackles an under-recognised problem in the field of machine translation evaluation, namely judging to what extent automatic metrics concur with the gold standard of human evaluation for a correct translation…
