AEQ-Bench: Measuring Empathy of Omni-Modal Large Models

Xuan Luo; Lewei Yao; Libo Zhao; Lanqing Hong; Kai Chen; Dehua Tao; Daxin Tan; Ruifeng Xu; Jing Li

arXiv:2601.10513·cs.CL·January 16, 2026

TL;DR

AEQ-Bench is a new benchmark that evaluates how well omni-modal large models (OLMs) generate empathetic responses from audio and text input and judge the empathy of audio responses, revealing strengths and limitations of current models.

Contribution

The paper introduces AEQ-Bench, the first benchmark specifically targeting empathy assessment in omni-modal large models, with two novel evaluation settings that vary in context specificity and speech tone.

Findings

01. OLMs trained with audio output capabilities generally outperform models with text-only outputs on empathy tasks.

02. OLMs align with human judgments on coarse-grained empathy but struggle with fine-grained paralinguistic cues.
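
As an illustration of what "alignment with human judgments" can mean in practice, the sketch below computes a Spearman rank correlation between model-assigned and human-assigned empathy ratings. This is an assumption for illustration only: the listing does not state which agreement metric AEQ-Bench uses, and the function names and ratings here are hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of items tied with the item at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant ratings")
    return cov / (var_x * var_y) ** 0.5


def spearman(model_scores, human_scores):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    if len(model_scores) != len(human_scores):
        raise ValueError("both raters must score the same set of responses")
    return pearson(average_ranks(model_scores), average_ranks(human_scores))


if __name__ == "__main__":
    # Hypothetical 1-5 empathy ratings for five audio responses.
    human = [1, 2, 3, 4, 5]
    model = [2, 1, 4, 3, 5]
    print(f"Spearman correlation: {spearman(model, human):.2f}")
```

A rank-based measure is a common choice for this kind of comparison because it only asks whether the model orders responses the way human raters do, not whether the two use the rating scale identically.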

Abstract

While the automatic evaluation of omni-modal large models (OLMs) is essential, assessing empathy remains a significant challenge due to its inherent affectivity. To investigate this challenge, we introduce AEQ-Bench (Audio Empathy Quotient Benchmark), a novel benchmark to systematically assess two core empathetic capabilities of OLMs: (i) generating empathetic responses by comprehending affective cues from multi-modal inputs (audio + text), and (ii) judging the empathy of audio responses without relying on text transcription. Compared to existing benchmarks, AEQ-Bench incorporates two novel settings that vary in context specificity and speech tone. Comprehensive assessment across linguistic and paralinguistic metrics reveals that (1) OLMs trained with audio output capabilities generally outperformed models with text-only outputs, and (2) while OLMs align with human judgments for…

Taxonomy

Topics: Emotion and Mood Recognition · Multimodal Machine Learning Applications · Music and Audio Processing