Harnessing the Power of Large Language Models for Empathetic Response Generation: Empirical Investigations and Improvements
Yushan Qian, Wei-Nan Zhang, Ting Liu

TL;DR
This paper empirically evaluates large language models like ChatGPT for empathetic response generation, proposing three methods to enhance performance and exploring GPT-4's potential to simulate human evaluators.
Contribution
It introduces three novel improvement techniques for LLMs in empathetic dialogue generation and demonstrates their effectiveness through extensive experiments.
Findings
LLMs improve significantly with the proposed methods
Achieved state-of-the-art results in automatic and human evaluations
GPT-4 can effectively simulate human evaluators
Abstract
Empathetic dialogue is an indispensable part of building harmonious social relationships and contributes to the development of a helpful AI. Previous approaches are mainly based on fine-tuning small-scale language models. With the advent of ChatGPT, the effectiveness of large language models (LLMs) in this field has attracted great attention. This work empirically investigates the performance of LLMs in generating empathetic responses and proposes three improvement methods: semantically similar in-context learning, two-stage interactive generation, and combination with the knowledge base. Extensive experiments show that LLMs can significantly benefit from our proposed methods and are able to achieve state-of-the-art performance in both automatic and human evaluations. Additionally, we explore the possibility of GPT-4 simulating human evaluators.
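To make the first of the three methods more concrete, here is a minimal sketch of semantically similar in-context learning: pick the stored exemplars most similar to the incoming dialogue context and place them in the prompt. This is an illustration, not the authors' implementation. The abstract does not say how similarity is computed, so the bag-of-words cosine score below is a stand-in for whatever sentence encoder the paper uses, and the exemplar pool, the prompt wording, and the function names `select_exemplars` and `build_prompt` are all hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased token counts; a crude stand-in for a sentence embedding."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def select_exemplars(query, pool, k=2):
    """Return the k pool entries whose context is most similar to the query."""
    q = bag_of_words(query)
    ranked = sorted(
        pool, key=lambda ex: cosine(q, bag_of_words(ex["context"])), reverse=True
    )
    return ranked[:k]


def build_prompt(query, exemplars):
    """Assemble a few-shot prompt (hypothetical wording) ending at the new turn."""
    parts = ["Respond empathetically to the speaker."]
    for ex in exemplars:
        parts.append(f"Speaker: {ex['context']}\nListener: {ex['response']}")
    parts.append(f"Speaker: {query}\nListener:")
    return "\n\n".join(parts)


# Made-up exemplar pool, for illustration only.
POOL = [
    {"context": "I failed my driving test again today.",
     "response": "That sounds frustrating. Do you want to talk about what happened?"},
    {"context": "My dog passed away last week.",
     "response": "I'm so sorry. Losing a companion is really hard."},
    {"context": "I just got promoted at work!",
     "response": "Congratulations! You must be thrilled."},
]

query = "I failed my math exam today"
prompt = build_prompt(query, select_exemplars(query, POOL, k=1))
```

The resulting `prompt` string would then be sent to an LLM of your choice; that call is omitted here because the abstract does not specify the interface.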
Taxonomy
Topics: Topic Modeling · Artificial Intelligence in Healthcare and Education
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Dense Connections · Label Smoothing · Adam · Absolute Position Encodings · Residual Connection
