EnTaCs: Analyzing the Relationship Between Sentiment and Language Choice in English-Tamil Code-Switching

Paul Bontempo

arXiv:2603.26587·cs.CL·March 30, 2026


TL;DR

This study examines how sentiment relates to language choice in English-Tamil code-switching. It applies token-level language identification to romanized YouTube comments and finds significant associations between utterance sentiment and both English proportion and language switch frequency.

Contribution

It links sentiment to language-switching patterns in code-switched text, using a fine-tuned XLM-RoBERTa model for token-level language identification.

Findings

01

Positive utterances have higher English proportion (34.3%) than negative ones (24.8%)

02

Mixed-sentiment utterances show the highest language switch frequency when controlling for utterance length

03

Results support the hypothesis that emotional content influences language choice, which the author attributes to socio-linguistic associations of prestige and identity

Abstract

This paper investigates the relationship between utterance sentiment and language choice in English-Tamil code-switched text, using methods from machine learning and statistical modelling. We apply a fine-tuned XLM-RoBERTa model for token-level language identification on 35,650 romanized YouTube comments from the DravidianCodeMix dataset, producing per-utterance measurements of English proportion and language switch frequency. Linear regression analysis reveals that positive utterances exhibit significantly greater English proportion (34.3%) than negative utterances (24.8%), and mixed-sentiment utterances show the highest language switch frequency when controlling for utterance length. These findings support the hypothesis that emotional content demonstrably influences language choice in multilingual code-switching settings, due to socio-linguistic associations of prestige and identity…
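The two per-utterance measurements named in the abstract can be illustrated with a minimal sketch. It assumes each utterance has already been tagged token by token with a language label (the hypothetical labels "en" and "ta" below), that English proportion is the share of tokens labeled English, and that switch frequency is the count of adjacent token pairs whose labels differ. The paper's exact definitions and label set may differ, and the function name `utterance_metrics` is illustrative only.

```python
from typing import List, Tuple


def utterance_metrics(tags: List[str]) -> Tuple[float, int]:
    """Return (English proportion, number of language switches) for one utterance.

    `tags` holds one language label per token, e.g. "en" or "ta".
    These labels and definitions are assumptions made for illustration.
    """
    if not tags:
        # An empty utterance has no tokens and therefore no switches.
        return 0.0, 0
    english = sum(1 for t in tags if t == "en")
    # A switch is counted whenever two neighbouring tokens carry different labels.
    switches = sum(1 for a, b in zip(tags, tags[1:]) if a != b)
    return english / len(tags), switches


# Toy example: a five-token utterance that moves from Tamil to English and back.
example_tags = ["ta", "ta", "en", "en", "ta"]
proportion, switches = utterance_metrics(example_tags)
print(proportion, switches)
```

Because longer utterances offer more opportunities to switch, a raw switch count like this one would typically be analyzed together with utterance length, in line with the abstract's note that the regression controls for length.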
