Socio-Emotional Response Generation: A Human Evaluation Protocol for LLM-Based Conversational Systems

Lorraine Vanel, Ariel R. Ramos Vela, Alya Yacoubi, Chloé Clavel (IDS, S2A, LTCI)

arXiv:2412.04492·cs.CL·December 9, 2024

TL;DR

This paper introduces a new human evaluation protocol and a neural planning architecture for socio-emotional response generation in conversational AI, improving transparency and assessment of emotional strategies in LLMs.

Contribution

It proposes a novel planning module for socio-emotional strategies, compares augmented LLMs with baseline models, and develops a comprehensive human evaluation protocol.
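
The plan-then-generate idea can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the strategy labels, the `plan_strategy` keyword rule (a toy stand-in for the paper's neural planner), the prompt layout, and the `generate` callable are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical label set for illustration only; the paper's actual
# strategy taxonomy is not given in this summary.
STRATEGIES = ("question", "reassurance", "information")


@dataclass
class Turn:
    speaker: str
    text: str


def plan_strategy(history: List[Turn]) -> str:
    """Toy stand-in for a planning module: pick one strategy label.

    A real planner would be a trained model; this keyword rule only
    marks where the planning step sits in the pipeline.
    """
    last = history[-1].text.lower() if history else ""
    if "?" in last:
        return "information"
    if any(word in last for word in ("worried", "sad", "afraid")):
        return "reassurance"
    return "question"


def build_prompt(history: List[Turn], strategy: Optional[str] = None) -> str:
    """Serialize the dialogue; append the planned strategy when present."""
    lines = [f"{turn.speaker}: {turn.text}" for turn in history]
    if strategy is not None:
        lines.append(f"[planned strategy: {strategy}]")
    lines.append("system:")
    return "\n".join(lines)


def respond(
    history: List[Turn],
    generate: Callable[[str], str],
    use_planner: bool = True,
) -> str:
    """Baseline (use_planner=False) vs. planner-augmented generation."""
    strategy = plan_strategy(history) if use_planner else None
    return generate(build_prompt(history, strategy))
```

Passing any text-generation function as `generate` and toggling `use_planner` gives the two conditions being compared: the same model with and without an explicit, inspectable strategy label in its input.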

Findings

1. Planning socio-emotional strategies improves response quality.

2. Current automated metrics often diverge from human judgments.

3. The proposed protocol provides detailed insights into social and emotional response quality.

Abstract

Conversational systems are now capable of producing impressive and generally relevant responses. However, we have neither visibility into nor control over the socio-emotional strategies behind state-of-the-art Large Language Models (LLMs), which poses a problem for their transparency and thus their trustworthiness in critical applications. Another issue is that current automated metrics cannot properly evaluate the quality of generated responses beyond the dataset's ground truth. In this paper, we propose a neural architecture that includes an intermediate step for planning socio-emotional strategies before response generation. We compare the performance of open-source baseline LLMs to the outputs of these same models augmented with our planning module. We also contrast the results of automated metrics with the evaluations provided by human annotators. We describe a…
