TL;DR
This paper introduces Bilingual Sentiment Embeddings (BLSE), a model that effectively captures sentiment information across languages with minimal resources, improving cross-lingual sentiment classification especially in low-resource languages.
Contribution
BLSE is a novel joint embedding model that requires only a small bilingual lexicon and monolingual embeddings, outperforming existing methods in cross-lingual sentiment tasks.
Findings
Outperforms state-of-the-art on 4 of 6 setups
Effectively captures sentiment in resource-poor languages
Complementary to machine translation approaches
Abstract
Sentiment analysis in low-resource languages suffers from a lack of annotated corpora to estimate high-performing models. Machine translation and bilingual word embeddings provide some relief through cross-lingual sentiment approaches. However, they either require large amounts of parallel data or do not sufficiently capture sentiment information. We introduce Bilingual Sentiment Embeddings (BLSE), which jointly represent sentiment information in a source and target language. This model only requires a small bilingual lexicon, a source-language corpus annotated for sentiment, and monolingual word embeddings for each language. We perform experiments on three language combinations (Spanish, Catalan, Basque) for sentence-level cross-lingual sentiment classification and find that our model significantly outperforms state-of-the-art methods on four out of six experimental setups, as well as…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
