Reasoning Beyond Labels: Measuring LLM Sentiment in Low-Resource, Culturally Nuanced Contexts

Millicent Ochieng; Anja Thieme; Ignatius Ezeani; Risa Ueno; Samuel Maina; Keshet Ronen; Javier Gonzalez; Jacki O'Neill

arXiv:2508.04199·cs.CL·August 7, 2025

TL;DR

This paper develops a framework to evaluate how large language models interpret sentiment in culturally nuanced, low-resource contexts, revealing significant variation in reasoning quality and emphasizing the importance of culturally sensitive AI evaluation.

Contribution

It introduces a diagnostic approach that treats sentiment as context-dependent and culturally embedded, and assesses LLM interpretability and robustness in informal, code-mixed communication.

Findings

01. Top-tier LLMs show interpretive stability in sentiment reasoning.

02. Open models often struggle with ambiguity and sentiment shifts.

03. Culturally sensitive evaluation is crucial for real-world NLP applications.

Abstract

Sentiment analysis in low-resource, culturally nuanced contexts challenges conventional NLP approaches that assume fixed labels and universal affective expressions. We present a diagnostic framework that treats sentiment as a context-dependent, culturally embedded construct, and evaluate how large language models (LLMs) reason about sentiment in informal, code-mixed WhatsApp messages from Nairobi youth health groups. Using a combination of human-annotated data, sentiment-flipped counterfactuals, and rubric-based explanation evaluation, we probe LLM interpretability, robustness, and alignment with human reasoning. Framing our evaluation through a social-science measurement lens, we operationalize and interrogate LLM outputs as an instrument for measuring the abstract concept of sentiment. Our findings reveal significant variation in model reasoning quality, with top-tier LLMs…
