Sentiment Classification of Code-Switched Text using Pre-trained Multilingual Embeddings and Segmentation

Saurav K. Aryal, Howard Prioleau, and Gloria Washington

arXiv:2210.16461 · cs.CL · November 1, 2022

Open Access

TL;DR

This paper presents a sentiment classification method for code-switched text that combines pre-trained multilingual embeddings with segmentation at code-switching points, improving accuracy by 11.2% and F1-score by 11.64% over a comparable baseline on a Spanish-English dataset.

Contribution

The paper introduces a multi-step NLP algorithm that locates code-switching points in mixed-language text and scores sentiment around them using semantic similarity from multilingual models.

Findings

01 Outperforms baseline by 11.2% in accuracy

02 Achieves 11.64% higher F1-score

03 Applicable to multiple languages with limited human effort

Abstract

With increasing globalization and immigration, various studies have estimated that about half of the world population is bilingual. Consequently, individuals concurrently use two or more languages or dialects in casual conversational settings. However, most research in natural language processing is focused on monolingual text. To further the work in code-switched sentiment analysis, we propose a multi-step natural language processing algorithm that identifies points of code-switching in mixed text and conducts sentiment analysis around those points. The proposed sentiment analysis algorithm uses semantic similarity derived from large pre-trained multilingual models with a handcrafted set of positive and negative words to determine the polarity of code-switched text. The proposed approach outperforms a comparable baseline model by 11.2% for accuracy and 11.64% for F1-score on a…
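The pipeline the abstract describes (segment at code-switching points, then compare each segment to positive and negative seed words by semantic similarity) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption for illustration and not taken from the paper: the `LANG` lexicon stands in for a real language identifier, the two-dimensional `VEC` table stands in for a large pre-trained multilingual embedding model, the seed lists `POSITIVE` and `NEGATIVE` are invented, and summing per-segment scores is just one possible way to aggregate.

```python
import math

# Toy stand-ins (assumptions, not from the paper): a tiny language lexicon and
# hand-made 2-d "embeddings" where related English/Spanish words sit close together.
LANG = {"i": "en", "love": "en", "this": "en", "movie": "en",
        "pero": "es", "el": "es", "final": "es", "es": "es", "terrible": "es"}
VEC = {"love": (1.0, 0.1), "good": (1.0, 0.0), "bueno": (0.9, 0.1),
       "terrible": (-1.0, 0.1), "bad": (-1.0, 0.0), "malo": (-0.9, 0.1)}
POSITIVE, NEGATIVE = ["good", "bueno"], ["bad", "malo"]  # hypothetical seed words


def segment_at_switch_points(tokens):
    """Split tokens into runs that share a language tag."""
    segments, prev_lang = [], None
    for tok in tokens:
        lang = LANG.get(tok, prev_lang)  # unknown tokens inherit the previous tag
        if segments and lang == prev_lang:
            segments[-1].append(tok)
        else:
            segments.append([tok])  # a language change opens a new segment
        prev_lang = lang
    return segments


def embed(tokens):
    """Average the vectors of known tokens; zero vector if none are known."""
    vecs = [VEC[t] for t in tokens if t in VEC]
    if not vecs:
        return (0.0, 0.0)
    return tuple(sum(c) / len(vecs) for c in zip(*vecs))


def cosine(a, b):
    norm = math.hypot(*a) * math.hypot(*b)
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


def segment_score(tokens):
    """Best similarity to a positive seed minus best similarity to a negative seed."""
    v = embed(tokens)
    pos = max(cosine(v, VEC[w]) for w in POSITIVE)
    neg = max(cosine(v, VEC[w]) for w in NEGATIVE)
    return pos - neg


def classify(text):
    """Sum segment scores and map the sign to a polarity label."""
    segments = segment_at_switch_points(text.lower().split())
    total = sum(segment_score(s) for s in segments)
    return "positive" if total > 0 else "negative" if total < 0 else "neutral"
```

Calling `segment_at_switch_points("i love this movie pero el final es terrible".split())` splits the sentence at `pero`, where the toy lexicon switches from English to Spanish. In this toy setup a sentence with one positive and one negative segment nets out close to zero, so how segment scores are weighted and combined is a design choice the sketch leaves open.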

Taxonomy

Topics: Natural Language Processing Techniques · Sentiment Analysis and Opinion Mining · Topic Modeling