Harnessing the Power of Large Language Models for Empathetic Response Generation: Empirical Investigations and Improvements

Yushan Qian; Wei-Nan Zhang; Ting Liu

arXiv:2310.05140·cs.CL·July 29, 2024

TL;DR

This paper empirically evaluates large language models like ChatGPT for empathetic response generation, proposing three methods to enhance performance and exploring GPT-4's potential to simulate human evaluators.

Contribution

It introduces three novel improvement techniques for LLMs in empathetic dialogue generation and demonstrates their effectiveness through extensive experiments.

Findings

01 LLMs significantly improve with proposed methods

02 Achieved state-of-the-art results in automatic and human evaluations

03 GPT-4 can effectively simulate human evaluators

Abstract

Empathetic dialogue is an indispensable part of building harmonious social relationships and contributes to the development of a helpful AI. Previous approaches are mainly based on fine-tuning small-scale language models. With the advent of ChatGPT, the effectiveness of large language models (LLMs) in this field has attracted great attention. This work empirically investigates the performance of LLMs in generating empathetic responses and proposes three improvement methods: semantically similar in-context learning, two-stage interactive generation, and combination with a knowledge base. Extensive experiments show that LLMs can significantly benefit from our proposed methods and are able to achieve state-of-the-art performance in both automatic and human evaluations. Additionally, we explore the possibility of GPT-4 simulating human evaluators.
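
As an illustration of the first method named in the abstract, semantically similar in-context learning can be sketched as: pick the dialogue examples closest to the current user utterance and place them in the prompt as demonstrations. The sketch below is not the authors' implementation. It assumes a bag-of-words cosine similarity as a stand-in for a real sentence encoder, and the names `select_exemplars`, `build_prompt`, `POOL`, and the instruction wording are all hypothetical.

```python
import math
from collections import Counter


def bow_vector(text):
    """Lowercased bag-of-words counts (assumed stand-in for a sentence embedding)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_exemplars(query, pool, k=2):
    """Return the k (context, response) pairs whose context is most similar to the query."""
    query_vec = bow_vector(query)
    ranked = sorted(
        pool,
        key=lambda pair: cosine(query_vec, bow_vector(pair[0])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, exemplars):
    """Assemble a few-shot prompt; the instruction text is a hypothetical placeholder."""
    lines = ["Respond empathetically to the speaker.", ""]
    for context, response in exemplars:
        lines += [f"Speaker: {context}", f"Listener: {response}", ""]
    lines += [f"Speaker: {query}", "Listener:"]
    return "\n".join(lines)


# Hypothetical exemplar pool; a real setup would draw these from a training set.
POOL = [
    ("I failed my driving test again today", "That sounds frustrating. How are you holding up?"),
    ("My dog passed away last week", "I'm so sorry for your loss."),
    ("I got promoted at work", "Congratulations, you must be thrilled!"),
]

if __name__ == "__main__":
    query = "I failed my math test today"
    print(build_prompt(query, select_exemplars(query, POOL, k=2)))
```

The resulting prompt string would then be sent to an LLM of choice. Swapping `bow_vector` and `cosine` for a learned sentence encoder changes only the ranking step.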

Code & Models

Repositories

27182812/LLM4ED (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Artificial Intelligence in Healthcare and Education

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Dense Connections · Label Smoothing · Adam · Absolute Position Encodings · Residual Connection