AttuneBench: A Conversation-Based Benchmark for LLM Emotional Intelligence

Kate M. Lubrano; Faisal Sayed; Ankita Rathod; Akshansh; Craver Corbyn Thomas-Smith; Mark E. Whiting; Karina Nguyen

arXiv:2605.21739 · cs.AI · May 22, 2026

TL;DR

AttuneBench is a benchmark for evaluating emotional intelligence in large language models, grounded in 200 genuine multi-turn human-model conversations that participants annotated turn by turn.

Contribution

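The paper introduces a framework for assessing multiple aspects of emotional intelligence in LLMs using genuine multi-turn interactions and turn-by-turn annotations of the participant's emotional state, the model's behavior, and the participant's preferred responses.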

Findings

01. Model rankings vary across emotion recognition and response quality tasks.

02. Preference prediction and response quality are more discriminative than emotion-label accuracy.

03. Emotionally intelligent behavior involves predicting user-specific responses in context.

Abstract

Emotional intelligence (EI), the ability to perceive, understand, and respond appropriately to others' emotional states, is central to human communication, and increasingly important to assess as LLMs assume conversational roles in everyday life. Existing EI benchmarks rely on synthetic prompts, single-turn cases, or third-party annotation. These approaches do not directly measure how models infer and respond to a participant's emotional state over the course of a real conversation. We introduce AttuneBench, a benchmark grounded in 200 genuine multi-turn human-model conversations in which participants conversed with anonymized LLMs and provided turn-by-turn annotations of their emotional state, the model's behavior, and their preferred responses. Across 11 evaluated models, we find that model rankings on emotion recognition, behavioral classification, preference prediction, and judged…
