NUIG-Shubhanker@Dravidian-CodeMix-FIRE2020: Sentiment Analysis of Code-Mixed Dravidian text using XLNet
Shubhanker Banerjee, Arun Jayapal, Sajeetha Thavareesan

TL;DR
This paper presents a sentiment analysis approach for code-mixed Tamil-English and Malayalam-English social media data using an XLNet model, addressing the challenges of multilingual and code-mixed NLP tasks.
Contribution
It introduces an application of XLNet for sentiment analysis on code-mixed Dravidian languages, which is a novel approach for this multilingual NLP challenge.
Findings
XLNet outperforms traditional models on code-mixed datasets
Effective handling of multilingual code-mixed sentiment analysis
Demonstrates feasibility of transformer models for low-resource language tasks
Abstract
Social media has penetrated multilingual societies, yet most users in them prefer English for communication. It is therefore natural for them to mix their native language with English during conversations, which has produced an abundance of multilingual data, known as code-mixed data. Downstream NLP tasks on such data are challenging because the semantics are spread across multiple languages. One such task is sentiment analysis; for this, we use an autoregressive XLNet model to perform sentiment analysis on code-mixed Tamil-English and Malayalam-English datasets.
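To make the final classification step concrete, the following is a minimal sketch of a sentiment head: a linear layer followed by a softmax over class scores. It is an illustration under stated assumptions, not the authors' implementation. The label names, weights, and three-dimensional input vector are hypothetical; in a real system the input vector would be the sentence representation produced by the XLNet encoder, and the weights would be learned during fine-tuning.

```python
import math

# Hypothetical label set; the real classes are defined by the task's datasets.
LABELS = ["positive", "negative", "mixed"]

def linear(x, weights, bias):
    """Affine map: produces one score (logit) per label."""
    return [sum(w_i * x_i for w_i, x_i in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(sentence_vector, weights, bias):
    """Return the most probable label and the full probability distribution."""
    probs = softmax(linear(sentence_vector, weights, bias))
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs

# Toy stand-in for the encoder output (assumed; an XLNet encoder would supply this).
vec = [0.5, -1.0, 2.0]
W = [[1.0, 0.0, 0.5],   # weights for "positive"
     [0.0, 1.0, 0.0],   # weights for "negative"
     [-0.5, 0.0, 0.0]]  # weights for "mixed"
b = [0.0, 0.0, 0.0]

label, probs = classify(vec, W, b)
print(label, [round(p, 3) for p in probs])
```

The softmax turns the raw scores into a probability distribution over sentiment classes, so the predicted label is simply the class with the highest probability.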
Taxonomy
Topics: Natural Language Processing Techniques · Sentiment Analysis and Opinion Mining · Topic Modeling
Methods: Linear Layer · Attention Is All You Need · Adam · Byte Pair Encoding · Softmax · Layer Normalization · Dense Connections · Multi-Head Attention · Dropout · SentencePiece
