Unsupervised Self-Training for Sentiment Analysis of Code-Switched Data
Akshat Gupta, Sargam Menghani, Sai Krishna Rallabandi, Alan W Black

TL;DR
This paper introduces an unsupervised self-training framework leveraging pre-trained BERT models for sentiment analysis in code-switched social media data, addressing the scarcity of annotated datasets.
Contribution
It presents a novel unsupervised approach that fine-tunes BERT models using pseudo labels from zero-shot transfer for code-switched sentiment analysis.
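The loop described above (zero-shot predictions become pseudo labels, the model is fine-tuned on them, and the cycle repeats) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `select_pseudo_labels`, `self_train`, `predict`, and `fine_tune` are hypothetical, and the rule of keeping the `per_class` most confident predictions for each class is an assumption, since this summary does not say how pseudo labels are selected. In practice `predict` and `fine_tune` would wrap a pre-trained BERT classifier.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Probs = Sequence[float]  # class probabilities for one sentence


def select_pseudo_labels(probs: Sequence[Probs], per_class: int) -> List[Tuple[int, int]]:
    """Return (example_index, pseudo_label) pairs for the most confident
    predictions of each class. The top-k-per-class rule is an assumption."""
    by_class: Dict[int, List[Tuple[float, int]]] = {}
    for idx, p in enumerate(probs):
        label = max(range(len(p)), key=lambda c: p[c])  # argmax class
        by_class.setdefault(label, []).append((p[label], idx))

    selected: List[Tuple[int, int]] = []
    for label, scored in sorted(by_class.items()):
        # Highest confidence first; ties broken by original position.
        scored.sort(key=lambda t: (-t[0], t[1]))
        selected.extend((idx, label) for _, idx in scored[:per_class])
    return selected


def self_train(
    predict: Callable[[Sequence[str]], Sequence[Probs]],
    fine_tune: Callable[[List[str], List[int]], None],
    texts: Sequence[str],
    rounds: int,
    per_class: int,
) -> Sequence[Probs]:
    """Hypothetical self-training loop: no gold labels are ever used."""
    for _ in range(rounds):
        probs = predict(texts)                      # round 1 is zero-shot
        chosen = select_pseudo_labels(probs, per_class)
        fine_tune([texts[i] for i, _ in chosen],    # update on pseudo labels only
                  [label for _, label in chosen])
    return predict(texts)
```

For example, with probabilities `[[0.9, 0.1], [0.6, 0.4], [0.2, 0.8], [0.45, 0.55]]` and `per_class=1`, the selection keeps sentence 0 as class 0 and sentence 2 as class 1, leaving the two less confident sentences unlabeled for that round.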
Findings
Unsupervised models achieve 1-7% lower weighted F1 scores than supervised models.
The framework effectively handles multiple code-switched languages.
Analysis suggests the model learns meaningful representations of code-switched language.
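The findings are reported in weighted F1. As a reference point, the sketch below computes the common support-weighted definition: the per-class F1 scores are averaged, each weighted by the fraction of true examples in that class. Whether the paper uses exactly this definition is an assumption, and the function name `weighted_f1` is illustrative.

```python
from collections import Counter
from typing import Hashable, Sequence


def weighted_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Support-weighted average of per-class F1 scores."""
    support = Counter(y_true)          # number of true examples per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0   # F1 = 2TP / (2TP + FP + FN)
        score += (n / total) * f1               # weight by class support
    return score
```

Because each class is weighted by its support, a majority sentiment class dominates the score on imbalanced data, which is worth keeping in mind when comparing the supervised and unsupervised numbers.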
Abstract
Sentiment analysis is an important task in understanding social media content such as customer reviews and Twitter and Facebook feeds. In multilingual communities around the world, a large amount of social media text is characterized by the presence of code-switching. Thus, it has become important to build models that can handle code-switched data. However, annotated code-switched data is scarce, and there is a need for unsupervised models and algorithms. We propose a general framework called Unsupervised Self-Training and show its applications for the specific use case of sentiment analysis of code-switched data. We use the power of pre-trained BERT models for initialization and fine-tune them in an unsupervised manner, using only pseudo labels produced by zero-shot transfer. We test our algorithm on multiple code-switched languages and provide a detailed analysis of the learning dynamics…
Taxonomy
Methods: Linear Layer · Weight Decay · WordPiece · Adam · Softmax · Dense Connections · Attention Is All You Need · Linear Warmup With Linear Decay · Dropout
