Bias Beneath the Tone: Empirical Characterisation of Tone Bias in LLM-Driven UX Systems
Heet Bodara, Md Masum Mushfiq, Isma Farah Siddiqui

TL;DR
This paper investigates subtle tone biases in large language models used in conversational AI. Empirical analysis reveals that these biases are systematic, and the authors propose methods for detecting them and for understanding their effect on user perception.
Contribution
It introduces a novel integration of controllable dialogue synthesis with tone classification, demonstrating systematic bias detection in LLM-driven systems.
Findings
Tone bias exists even in neutral prompts
Ensemble classifiers achieve macro F1 scores up to 0.92
Bias influences perceptions of trust and fairness
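For readers unfamiliar with the metric in the second finding, macro F1 is the unweighted mean of the per-class F1 scores, so every tone class counts equally regardless of how often it occurs. The sketch below is a minimal pure-Python illustration under stated assumptions: the function name `macro_f1`, the three tone labels, and the toy predictions are all hypothetical and are not taken from the paper's data or code.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Return the unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical tone labels, for illustration only (not data from the paper).
y_true = ["neutral", "neutral", "positive", "negative", "positive", "negative"]
y_pred = ["neutral", "positive", "positive", "negative", "positive", "neutral"]
score = macro_f1(y_true, y_pred)
print(f"macro F1 = {score:.3f}")
```

Because each class contributes equally to the mean, a classifier that performs poorly on a rare tone class is penalised as much as one that performs poorly on a common class.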
Abstract
Large Language Models are increasingly used in conversational systems such as digital personal assistants, shaping how people interact with technology through language. While their responses often sound fluent and natural, they can also carry subtle tone biases, such as sounding overly polite, cheerful, or cautious even when neutrality is expected. These tendencies can influence how users perceive trust, empathy, and fairness in dialogue. In this study, we explore tone bias as a hidden behavioral trait of large language models. The novelty of this research lies in the integration of controllable large language model-based dialogue synthesis with tone classification models, enabling robust and ethical emotion recognition in personal assistant interactions. We created two synthetic dialogue datasets, one generated from neutral prompts and another explicitly guided to produce positive or…
Taxonomy
Topics: AI in Service Interactions · Digital Mental Health Interventions · Topic Modeling
