ChatGPT Reads Your Tone and Responds Accordingly -- Until It Does Not -- Emotional Framing Induces Bias in LLM Outputs
Franck Bardol

TL;DR
This study investigates how emotional tone in prompts influences GPT-4 responses, revealing biases like overcorrection and tone suppression, with implications for AI alignment and trust.
Contribution
It systematically analyzes emotional framing effects on GPT-4, introduces concepts like tone floor and tone-valence matrices, and visualizes semantic drift in responses.
Findings
GPT-4 is three times less likely to respond negatively to a negatively framed question than to a neutral one.
Tone-based variation is suppressed on sensitive topics (e.g., justice or politics), suggesting an alignment override.
Emotional framing induces measurable semantic drift in model responses.
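One way to put a number on "semantic drift" is to compare the embedding of a response to a neutrally framed prompt with the embedding of a response to an emotionally framed version of the same prompt. The sketch below uses cosine distance as an assumed metric; the summary does not say which measure the paper uses. The three-dimensional vectors are hypothetical stand-ins for real 1536-dimensional embeddings.

```python
import math

def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

# Hypothetical toy embeddings (not data from the paper).
response_to_neutral_prompt = [1.0, 0.0, 0.0]
response_to_negative_prompt = [0.0, 1.0, 0.0]

# 0.0 means the two responses point the same way; larger values mean more drift.
drift = cosine_distance(response_to_neutral_prompt, response_to_negative_prompt)
print(f"semantic drift (cosine distance): {drift:.2f}")
```

With real embeddings, the same function would be applied to each pair of responses that share a question and differ only in prompt tone.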
Abstract
Large Language Models like GPT-4 adjust their responses not only based on the question asked, but also on how it is emotionally phrased. We systematically vary the emotional tone of 156 prompts - spanning controversial and everyday topics - and analyze how it affects model responses. Our findings show that GPT-4 is three times less likely to respond negatively to a negatively framed question than to a neutral one. This suggests a "rebound" bias where the model overcorrects, often shifting toward neutrality or positivity. On sensitive topics (e.g., justice or politics), this effect is even more pronounced: tone-based variation is suppressed, suggesting an alignment override. We introduce concepts like the "tone floor" - a lower bound in response negativity - and use tone-valence transition matrices to quantify behavior. Visualizations based on 1536-dimensional embeddings confirm semantic…
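A tone-valence transition matrix can be read as a table of conditional frequencies: for each prompt tone, the share of responses that fall into each valence class. The following is a minimal sketch under that assumption, not the paper's implementation. The three-class labeling, the function name `transition_matrix`, and the toy pairs are all hypothetical and are not data from the study.

```python
from collections import Counter

# Assumed three-way labeling for both prompt tone and response valence.
LABELS = ["negative", "neutral", "positive"]

def transition_matrix(pairs):
    """Row-normalise (prompt_tone, response_valence) counts.

    Rows are prompt tones, columns are response valences, and each
    non-empty row sums to 1.
    """
    counts = Counter(pairs)
    matrix = {}
    for tone in LABELS:
        row_total = sum(counts[(tone, valence)] for valence in LABELS)
        matrix[tone] = {
            valence: (counts[(tone, valence)] / row_total if row_total else 0.0)
            for valence in LABELS
        }
    return matrix

# Hypothetical toy observations (not results from the paper).
pairs = [
    ("negative", "negative"), ("negative", "neutral"),
    ("negative", "neutral"), ("negative", "positive"),
    ("neutral", "neutral"), ("neutral", "negative"),
    ("positive", "positive"), ("positive", "positive"),
]

matrix = transition_matrix(pairs)
for tone in LABELS:
    print(tone, matrix[tone])
```

In a table like this, the "rebound" effect described in the abstract would show up as a smaller negative-to-negative entry than neutral-to-negative entry.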
