Progressive Sentiment Analysis for Code-Switched Text Data

Sudhanshu Ranjan; Dheeraj Mekala; Jingbo Shang

arXiv:2210.14380·cs.CL·October 27, 2022

TL;DR

This paper introduces a progressive training framework for sentiment analysis on code-switched text. It transfers from a labelled dataset in a resource-rich language to unlabelled code-switched data while limiting bias towards the resource-rich language.

Contribution

It proposes a novel progressive training method that accounts for language resource disparity, enhancing sentiment analysis on code-switched data.

Findings

01 Progressive training improves low-resource language performance.

02 Method outperforms baseline models on multiple language pairs.

03 Training from resource-rich to low-resource samples is effective.
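
As a rough illustration of the resource-rich-to-low-resource ordering named in the third finding, the sketch below ranks code-switched samples by the share of their tokens that belong to the resource-rich language, then splits them into buckets to be visited in that order. It is a sketch under stated assumptions, not the authors' implementation: the helper names `rich_fraction` and `progressive_buckets`, the toy English vocabulary standing in for a language identifier, and the near-equal bucket sizes are all hypothetical.

```python
from typing import Callable, List, Sequence

Sample = Sequence[str]  # one tokenized sentence


def rich_fraction(tokens: Sample, is_rich: Callable[[str], bool]) -> float:
    """Share of tokens that belong to the resource-rich language."""
    if not tokens:
        return 0.0
    return sum(1 for tok in tokens if is_rich(tok)) / len(tokens)


def progressive_buckets(
    samples: Sequence[Sample],
    is_rich: Callable[[str], bool],
    num_buckets: int,
) -> List[List[Sample]]:
    """Rank samples from most to least resource-rich and split into buckets.

    Assumption: buckets are near-equal in size; any remainder goes to the
    earliest buckets.
    """
    if num_buckets < 1:
        raise ValueError("num_buckets must be at least 1")
    ranked = sorted(samples, key=lambda s: rich_fraction(s, is_rich), reverse=True)
    size, extra = divmod(len(ranked), num_buckets)
    buckets: List[List[Sample]] = []
    start = 0
    for i in range(num_buckets):
        end = start + size + (1 if i < extra else 0)
        buckets.append(list(ranked[start:end]))
        start = end
    return buckets


# Toy data: a hypothetical English word list acts as the language identifier.
ENGLISH = {"the", "movie", "was", "great", "good"}
samples = [
    ["movie", "bahut", "accha", "tha"],
    ["the", "movie", "was", "great"],
    ["yeh", "film", "bekaar", "thi"],
    ["movie", "was", "bahut", "good"],
]
buckets = progressive_buckets(samples, lambda tok: tok in ENGLISH, num_buckets=2)
```

A training loop would then fine-tune on `buckets[0]` first and move towards the last bucket. How labels are obtained for the unlabelled code-switched samples is not described in the truncated abstract, so it is left out here.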

Abstract

Multilingual transformer language models have recently attracted much attention from researchers and are used in cross-lingual transfer learning for many NLP tasks such as text classification and named entity recognition. However, similar methods for transfer learning from monolingual text to code-switched text have not been extensively explored, mainly due to the following challenges: (1) a code-switched corpus, unlike a monolingual corpus, consists of more than one language, so existing methods cannot be applied efficiently; (2) a code-switched corpus usually mixes resource-rich and low-resource languages, and when multilingual pre-trained language models are used, the final model may be biased towards the resource-rich language. In this paper, we focus on code-switched sentiment analysis where we have a labelled resource-rich language dataset and unlabelled code-switched data. We propose a…

Code & Models

Repositories

s1998/progressivetraincodeswitch
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis