Progressive Sentiment Analysis for Code-Switched Text Data
Sudhanshu Ranjan, Dheeraj Mekala, Jingbo Shang

TL;DR
This paper introduces a progressive training framework for sentiment analysis on code-switched text. It uses a labelled resource-rich language dataset together with unlabelled code-switched data, and aims to keep the final model from being biased toward the resource-rich language.
Contribution
It proposes a novel progressive training method that accounts for language resource disparity, enhancing sentiment analysis on code-switched data.
Findings
Progressive training improves low-resource language performance.
Method outperforms baseline models on multiple language pairs.
Training from resource-rich to low-resource samples is effective.
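The resource-rich-to-low-resource ordering in the findings above can be illustrated with a small sketch. This is not the authors' implementation. It assumes each code-switched sample already carries hypothetical per-token language tags (here "en" for the resource-rich language and "hi" for the low-resource one), and it assumes that the fraction of resource-rich tokens is a reasonable way to order samples. The function names `rich_fraction` and `make_buckets` are invented for illustration.

```python
from typing import List, Sequence, Tuple

# A sample is (text, per-token language tags). The tags are an assumption:
# in practice they would come from some token-level language identifier.
Sample = Tuple[str, Sequence[str]]


def rich_fraction(tags: Sequence[str], rich_lang: str = "en") -> float:
    """Return the fraction of tokens tagged as the resource-rich language."""
    if not tags:
        return 0.0
    return sum(1 for t in tags if t == rich_lang) / len(tags)


def make_buckets(
    samples: Sequence[Sample], num_buckets: int = 3, rich_lang: str = "en"
) -> List[List[Sample]]:
    """Order samples from most to least resource-rich, then split them
    into contiguous buckets of near-equal size."""
    if num_buckets < 1:
        raise ValueError("num_buckets must be at least 1")
    ordered = sorted(
        samples, key=lambda s: rich_fraction(s[1], rich_lang), reverse=True
    )
    size, extra = divmod(len(ordered), num_buckets)
    buckets: List[List[Sample]] = []
    start = 0
    for i in range(num_buckets):
        # The first `extra` buckets take one additional sample each.
        end = start + size + (1 if i < extra else 0)
        buckets.append(list(ordered[start:end]))
        start = end
    return buckets
```

A training loop would then visit `buckets[0]`, `buckets[1]`, and so on in order, so that the model sees the samples closest to the resource-rich language first. How each stage is trained (for example, with which labels) is not specified in this summary and is left out of the sketch.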
Abstract
Multilingual transformer language models have recently attracted much attention from researchers and are used in cross-lingual transfer learning for many NLP tasks such as text classification and named entity recognition. However, similar methods for transfer learning from monolingual text to code-switched text have not been extensively explored, mainly due to the following challenges: (1) a code-switched corpus, unlike a monolingual corpus, consists of more than one language, and existing methods can't be applied efficiently; (2) a code-switched corpus is usually made of resource-rich and low-resource languages, and when multilingual pre-trained language models are used, the final model might be biased towards the resource-rich language. In this paper, we focus on code-switched sentiment analysis where we have a labelled resource-rich language dataset and unlabelled code-switched data. We propose a…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis
