EICAP: Deep Dive in Assessment and Enhancement of Large Language Models in Emotional Intelligence through Multi-Turn Conversations

Nizi Nazar; Ehsaneddin Asgari

arXiv:2508.06196·cs.CL·August 11, 2025

Open Access

TL;DR

This paper introduces a new framework and benchmark for evaluating and improving emotional intelligence in large language models, revealing current limitations and potential for targeted fine-tuning.

Contribution

It proposes a four-layer EI taxonomy and a multi-turn benchmark, and shows that fine-tuning has limited impact on EI capabilities, with significant improvement only in the Appraisal layer.

Findings

01. Qwen2.5-Instruct outperforms other models on EI tasks

02. Fine-tuning improves only the Appraisal layer significantly

03. Current training paradigms do not fully enhance emotional reasoning

Abstract

Emotional Intelligence (EI) is a critical yet underexplored dimension in the development of human-aligned large language models (LLMs). To address this gap, we introduce a unified, psychologically grounded four-layer taxonomy of EI tailored for LLMs, encompassing emotional tracking, cause inference, appraisal, and emotionally appropriate response generation. Building on this framework, we present EICAP-Bench, a novel MCQ-style multi-turn benchmark designed to evaluate EI capabilities in open-source LLMs across diverse linguistic and cultural contexts. We evaluate six LLMs: LLaMA3 (8B), LLaMA3-Instruct, Gemma (9B), Gemma-Instruct, Qwen2.5 (7B), and Qwen2.5-Instruct on EICAP-Bench, identifying Qwen2.5-Instruct as the strongest baseline. To assess the potential for enhancing EI capabilities, we fine-tune both Qwen2.5-Base and Qwen2.5-Instruct using LoRA adapters on UltraChat (UC), a…
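
As a rough illustration of how an MCQ-style benchmark organised around the four layers named in the abstract might be scored, the sketch below computes accuracy per layer. It is not the authors' evaluation code: the `McqItem` record, the layer identifiers, and the `accuracy_by_layer` function are hypothetical names introduced here, and the toy items are made up for the example.

```python
from collections import defaultdict
from dataclasses import dataclass

# Layer identifiers are assumptions, derived from the four layers listed
# in the abstract (tracking, cause inference, appraisal, response generation).
LAYERS = ("tracking", "cause_inference", "appraisal", "response_generation")


@dataclass(frozen=True)
class McqItem:
    """One multiple-choice question with the model's chosen option (hypothetical schema)."""
    layer: str      # one of LAYERS
    gold: str       # label of the correct option, e.g. "B"
    predicted: str  # label of the option the model selected


def accuracy_by_layer(items):
    """Return {layer: accuracy} for every layer that has at least one item."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for item in items:
        if item.layer not in LAYERS:
            raise ValueError(f"unknown layer: {item.layer!r}")
        total[item.layer] += 1
        # Compare option labels case-insensitively, ignoring stray whitespace.
        if item.predicted.strip().upper() == item.gold.strip().upper():
            correct[item.layer] += 1
    return {layer: correct[layer] / total[layer] for layer in LAYERS if total[layer]}


# Toy data for illustration only; these are not benchmark results.
toy_items = [
    McqItem("appraisal", "A", "a"),
    McqItem("appraisal", "B", "C"),
    McqItem("tracking", "D", "D"),
]
scores = accuracy_by_layer(toy_items)
```

Reporting accuracy separately per layer, rather than as one pooled number, is what would make a layer-specific effect (such as a change confined to appraisal) visible when comparing a base model with a fine-tuned one.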

Taxonomy

Topics: AI in Service Interactions · Deception detection and forensic psychology · Education and Communication Studies