# Embedding Projection for Targeted Cross-Lingual Sentiment: Model Comparisons and a Real-World Study

**Authors:** Jeremy Barnes, Roman Klinger

arXiv: 1906.10519 · 2019-06-26

## TL;DR

This paper introduces a cross-lingual sentiment analysis model for under-resourced languages that incorporates sentiment information into bilingual representations. It reaches state-of-the-art sentence-level results when combined with machine translation, and it analyzes how source language choice, the amount of unlabeled data, and domain mismatch affect performance.

## Contribution

The paper proposes an embedding projection method that jointly optimizes bilingual representations for semantics and sentiment, and adapts it to targeted sentiment analysis across multiple languages and domains.

## Key findings

- State-of-the-art sentence-level performance when the model is combined with machine translation.
- Outperforms other projection-based bilingual embedding methods on binary targeted sentiment tasks.
- Across ten languages, the amount of unlabeled monolingual data has surprisingly little effect on sentiment results.

## Abstract

Sentiment analysis benefits from large, hand-annotated resources in order to train and test machine learning models, which are often data hungry. While some languages, e.g., English, have a vast array of these resources, most under-resourced languages do not, especially for fine-grained sentiment tasks, such as aspect-level or targeted sentiment analysis. To improve this situation, we propose a cross-lingual approach to sentiment analysis that is applicable to under-resourced languages and takes into account target-level information. This model incorporates sentiment information into bilingual distributional representations, by jointly optimizing them for semantics and sentiment, showing state-of-the-art performance at sentence-level when combined with machine translation. The adaptation to targeted sentiment analysis on multiple domains shows that our model outperforms other projection-based bilingual embedding methods on binary targeted sentiment tasks. Our analysis on ten languages demonstrates that the amount of unlabeled monolingual data has surprisingly little effect on the sentiment results. As expected, the choice of annotated source language for projection to a target leads to better results for source-target language pairs which are similar. Therefore, our results suggest that more efforts should be spent on the creation of resources for less similar languages to those which are resource-rich already. Finally, a domain mismatch leads to a decreased performance. This suggests resources in any language should ideally cover varieties of domains.
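
The abstract describes jointly optimizing bilingual representations for semantics and sentiment but does not spell out the objective in this summary view. The following is only a minimal sketch of what such a joint objective could look like, assuming linear projection matrices for each language, a linear sentiment classifier over averaged projected embeddings, and a weighted sum of the two losses. All names (`M_src`, `M_tgt`, `W`, `alpha`, and the functions themselves) are hypothetical and are not taken from the paper.

```python
import math


def matvec(M, v):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    exps = [math.exp(x - m) for x in z]
    total = sum(exps)
    return [e / total for e in exps]


def projection_loss(M_src, M_tgt, pairs):
    """Semantic term (assumed): mean squared distance between the projected
    source and target embeddings of each translation pair."""
    total = 0.0
    for src_vec, tgt_vec in pairs:
        ps = matvec(M_src, src_vec)
        pt = matvec(M_tgt, tgt_vec)
        total += sum((a - b) ** 2 for a, b in zip(ps, pt))
    return total / len(pairs)


def sentiment_loss(M_src, W, examples):
    """Sentiment term (assumed): cross-entropy of a linear classifier applied
    to the projected average of a source-language example's word embeddings."""
    total = 0.0
    for word_vecs, label in examples:
        dim = len(word_vecs[0])
        avg = [sum(v[i] for v in word_vecs) / len(word_vecs) for i in range(dim)]
        probs = softmax(matvec(W, matvec(M_src, avg)))
        total -= math.log(probs[label])
    return total / len(examples)


def joint_loss(M_src, M_tgt, W, pairs, examples, alpha=0.5):
    """Weighted sum of the two terms; alpha is a hypothetical trade-off weight."""
    return (alpha * sentiment_loss(M_src, W, examples)
            + (1 - alpha) * projection_loss(M_src, M_tgt, pairs))
```

In this sketch, only source-language examples carry sentiment labels, while translation pairs tie the two projected spaces together. Under that assumption, a classifier trained on the source side could be applied to target-language text through `M_tgt`. The complete paper should be consulted for the actual architecture and training details.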

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1906.10519/full.md

## Figures

15 figures with captions in the complete paper: https://tomesphere.com/paper/1906.10519/full.md

## References

111 references — full list in the complete paper: https://tomesphere.com/paper/1906.10519/full.md

---
Source: https://tomesphere.com/paper/1906.10519