YouTube Comments Decoded: Leveraging LLMs for Low Resource Language Classification
Aniket Deroy, Subhankar Maity

TL;DR
This paper explores using large language models like GPT-3.5 Turbo to classify sarcasm in code-mixed Tamil-English and Malayalam-English social media comments, addressing challenges in low-resource language sentiment analysis.
Contribution
It applies LLM prompting to the shared task's new gold-standard corpus for sarcasm detection in code-mixed languages and reports how well this approach performs.
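As a rough illustration of what prompting an LLM for this task can look like, the sketch below builds a zero-shot prompt and maps a free-text reply onto a label. The prompt wording, the label names, the fallback rule, and the helper names build_prompt and parse_label are all assumptions made for illustration; they are not taken from the paper, and no model call is made here.

```python
# Hypothetical label names; the paper's actual label set is not shown in this summary.
LABELS = ("Sarcastic", "Non-sarcastic")


def build_prompt(comment: str, language_pair: str) -> str:
    """Build a zero-shot classification prompt (illustrative wording only)."""
    return (
        f"The following is a code-mixed {language_pair} social media comment.\n"
        f"Comment: {comment}\n"
        f"Is the comment sarcastic? Answer with exactly one of: {', '.join(LABELS)}."
    )


def parse_label(response: str) -> str:
    """Map a free-text model reply onto one of the assumed labels."""
    text = response.strip().lower()
    # Check the negative label first: "sarcastic" is a substring of "non-sarcastic".
    if "non-sarcastic" in text or "not sarcastic" in text:
        return "Non-sarcastic"
    if "sarcastic" in text:
        return "Sarcastic"
    # Assumed fallback when the reply matches neither label.
    return "Non-sarcastic"


if __name__ == "__main__":
    print(build_prompt("<comment text here>", "Tamil-English"))
    print(parse_label("Sarcastic."))
```

The string returned by build_prompt would be sent to a chat model such as GPT-3.5 Turbo, and the reply passed through parse_label before scoring.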
Findings
Achieved a macro-F1 of 0.61 on Tamil-English comments
Achieved a macro-F1 of 0.50 on Malayalam-English comments
Showed LLM prompting as a promising approach for low-resource sarcasm detection
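For readers unfamiliar with the metric, macro-F1 is the unweighted mean of the per-class F1 scores, so a minority class such as sarcasm counts as much as the majority class. The sketch below computes it from scratch on a made-up toy example; the labels and predictions are invented for illustration and are not data from the paper.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        scores.append(f1)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy data (invented): S = sarcastic, N = non-sarcastic.
    y_true = ["S", "S", "N", "N", "N", "N"]
    y_pred = ["S", "N", "N", "N", "N", "S"]
    # Class S: precision 1/2, recall 1/2, F1 0.5
    # Class N: precision 3/4, recall 3/4, F1 0.75
    print(macro_f1(y_true, y_pred))  # 0.625
```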
Abstract
Sarcasm detection is a significant challenge in sentiment analysis, particularly due to its nature of conveying opinions where the intended meaning deviates from the literal expression. This challenge is heightened in social media contexts where code-mixing, especially in Dravidian languages, is prevalent. Code-mixing involves the blending of multiple languages within a single utterance, often with non-native scripts, complicating the task for systems trained on monolingual data. This shared task introduces a novel gold standard corpus designed for sarcasm and sentiment detection within code-mixed texts, specifically in Tamil-English and Malayalam-English languages. The primary objective of this task is to identify sarcasm and sentiment polarity within a code-mixed dataset of Tamil-English and Malayalam-English comments and posts collected from social media platforms. Each comment or…
Taxonomy
Topics: Natural Language Processing Techniques · Hate Speech and Cyberbullying Detection
Methods: Attention Is All You Need · Cosine Annealing · Layer Normalization · Adam · Attention Dropout · Linear Layer · Weight Decay · Multi-Head Attention
