EICAP: Deep Dive in Assessment and Enhancement of Large Language Models in Emotional Intelligence through Multi-Turn Conversations
Nizi Nazar, Ehsaneddin Asgari

TL;DR
This paper introduces a new framework and benchmark for evaluating and improving emotional intelligence in large language models, revealing current limitations and potential for targeted fine-tuning.
Contribution
It proposes a four-layer EI taxonomy and a multi-turn benchmark, and shows that fine-tuning has limited impact on EI capabilities, significantly improving only the Appraisal layer.
Findings
Qwen2.5-Instruct outperforms other models on EI tasks
Fine-tuning improves only the Appraisal layer significantly
Current training paradigms do not fully enhance emotional reasoning
Abstract
Emotional Intelligence (EI) is a critical yet underexplored dimension in the development of human-aligned large language models (LLMs). To address this gap, we introduce a unified, psychologically grounded four-layer taxonomy of EI tailored for LLMs, encompassing emotional tracking, cause inference, appraisal, and emotionally appropriate response generation. Building on this framework, we present EICAP-Bench, a novel MCQ-style multi-turn benchmark designed to evaluate EI capabilities in open-source LLMs across diverse linguistic and cultural contexts. We evaluate six LLMs: LLaMA3 (8B), LLaMA3-Instruct, Gemma (9B), Gemma-Instruct, Qwen2.5 (7B), and Qwen2.5-Instruct on EICAP-Bench, identifying Qwen2.5-Instruct as the strongest baseline. To assess the potential for enhancing EI capabilities, we fine-tune both Qwen2.5-Base and Qwen2.5-Instruct using LoRA adapters on UltraChat (UC), a…
Taxonomy
Topics: AI in Service Interactions · Deception detection and forensic psychology · Education and Communication Studies
