Towards a Multidimensional Evaluation Framework for Empathetic Conversational Systems
Aravind Sesagiri Raamkumar, Siyuan Brandon Loh

TL;DR
This paper introduces a comprehensive, multidimensional framework for evaluating empathy in conversational systems, addressing limitations of existing methods by measuring empathy at structural, behavioral, and overall levels.
Contribution
The paper proposes an evaluation framework with three new methods for assessing empathy in Empathetic Conversational Systems (ECS), covering the structural, behavioral, and overall levels of a conversation.
Findings
Framework effectively measures empathy at multiple levels
Experiments validate the framework's usefulness with ECS and LLMs
Improves upon existing evaluation approaches for empathetic conversations
Abstract
Empathetic Conversational Systems (ECS) are built to respond empathetically to the user's emotions and sentiments, regardless of the application domain. Evaluation approaches in current ECS studies are restricted to offline evaluation experiments, primarily for gold-standard comparison and benchmarking, and to user evaluation studies for collecting human ratings on specific constructs. These methods are inadequate for measuring the actual quality of empathy in conversations. In this paper, we propose a multidimensional empathy evaluation framework with three new methods for measuring empathy at (i) the structural level, using three empathy-related dimensions, (ii) the behavioral level, using empathy behavioral types, and (iii) the overall level, using an empathy lexicon, thereby fortifying the evaluation process. Experiments were conducted with state-of-the-art ECS models and large language models (LLMs) to…
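The abstract does not say how the empathy lexicon is applied, so the following is only a minimal sketch of what a lexicon-based overall score could look like, not the authors' method. The lexicon entries, their weights, the tokenizer, and the function names `tokenize` and `lexicon_empathy_score` are all hypothetical and chosen purely for illustration.

```python
import re

# Hypothetical toy lexicon (word -> empathy weight in [0, 1]).
# These entries and weights are illustrative assumptions, not taken from the paper.
TOY_EMPATHY_LEXICON = {
    "sorry": 0.9,
    "understand": 0.8,
    "tough": 0.6,
    "hear": 0.5,
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def lexicon_empathy_score(response, lexicon=TOY_EMPATHY_LEXICON):
    """Return the mean lexicon weight over all tokens in the response.

    Tokens missing from the lexicon contribute 0.0, so a longer response
    with few empathy-related words receives a lower score.
    An empty response scores 0.0.
    """
    tokens = tokenize(response)
    if not tokens:
        return 0.0
    return sum(lexicon.get(token, 0.0) for token in tokens) / len(tokens)
```

Under these assumptions, `lexicon_empathy_score("I am sorry")` averages one matched weight over three tokens, while a response with no lexicon words scores zero. Averaging over all tokens is one possible design choice; a real lexicon-based measure might instead normalize per sentence or weight words differently.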
Taxonomy
Topics: Software Engineering Techniques and Practices · Education and Critical Thinking Development · AI in Service Interactions
