WESSA at SemEval-2020 Task 9: Code-Mixed Sentiment Analysis using Transformers

Ahmed Sultan (WideBot); Mahmoud Salim (WideBot); Amina Gaber (WideBot); Islam El Hosary (WideBot)

arXiv:2009.09879·cs.CL·September 22, 2020

TL;DR

This paper presents a transformer-based transfer learning approach to sentiment analysis of code-mixed social media text. Fine-tuning the multilingual model XLM-RoBERTa on monolingual English and Spanish data and on Spanish-English code-mixed data outperforms the official task baseline.

Contribution

The paper introduces a transfer learning method that fine-tunes XLM-RoBERTa for code-mixed sentiment analysis and outperforms the official baseline of SemEval-2020 Task 9.

Findings

01 Achieved a 70.1% average F1-score on the official leaderboard (test set)

02 Improved to a 75.9% average F1-score on the test set in later submissions

03 Demonstrated the effectiveness of a multilingual transformer (XLM-RoBERTa) on Spanish-English code-mixed data

Abstract

In this paper, we describe our system submitted for SemEval-2020 Task 9, Sentiment Analysis for Code-Mixed Social Media Text, alongside other experiments. Our best performing system is a Transfer Learning-based model that fine-tunes "XLM-RoBERTa", a transformer-based multilingual masked language model, on monolingual English and Spanish data and Spanish-English code-mixed data. Our system outperforms the official task baseline by achieving a 70.1% average F1-Score on the official leaderboard using the test set. In later submissions under the CodaLab username "ahmed0sultan", our system achieves a 75.9% average F1-Score on the test set.
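
The abstract reports an "average F1-Score" without stating here how the per-class scores are averaged. The sketch below is one way to compute such a score over sentiment labels, and it shows both an unweighted (macro) and a support-weighted average. The function names, the label strings, and the toy predictions are illustrative assumptions, not the task's official scorer or the authors' code.

```python
from collections import Counter


def per_class_f1(y_true, y_pred, label):
    """F1 for one label, treating it as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        # No correct predictions for this label: F1 is defined as 0 here.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def average_f1(y_true, y_pred, weighted=False):
    """Average per-class F1 over the labels present in y_true.

    weighted=False gives the macro average (every class counts equally);
    weighted=True weights each class by its number of true examples.
    Which of the two the leaderboard used is an assumption left open here.
    """
    labels = sorted(set(y_true))
    scores = {label: per_class_f1(y_true, y_pred, label) for label in labels}
    if weighted:
        support = Counter(y_true)
        return sum(scores[l] * support[l] for l in labels) / len(y_true)
    return sum(scores.values()) / len(labels)


# Toy example with hypothetical labels and predictions.
gold = ["positive", "negative", "neutral", "positive"]
pred = ["positive", "negative", "positive", "positive"]
macro = average_f1(gold, pred)
weighted = average_f1(gold, pred, weighted=True)
```

On the toy data the two averages differ because the "neutral" class, which is never predicted correctly, counts for a third of the macro score but only a quarter of the weighted one.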
