Can LLMs Understand the Implication of Emphasized Sentences in Dialogue?

Guan-Ting Lin; Hung-yi Lee

arXiv:2406.11065 · cs.CL · October 1, 2024

TL;DR

This paper introduces Emphasized-Talk, a benchmark for evaluating LLMs' ability to understand emphasis in dialogue, revealing that current models perform reasonably but still need improvement in grasping implied meanings.

Contribution

The paper presents a new benchmark with emphasis-annotated dialogue samples and an automatic GPT-4 based evaluation pipeline for assessing LLMs' understanding of emphasis.

Findings

01 Commercial LLMs outperform open-source models

02 Current models show limited understanding of emphasis implications

03 GPT-4 based evaluation correlates well with human judgment

Abstract

Emphasis is a crucial component in human communication, which indicates the speaker's intention and implication beyond pure text in dialogue. While Large Language Models (LLMs) have revolutionized natural language processing, their ability to understand emphasis in dialogue remains unclear. This paper introduces Emphasized-Talk, a benchmark with emphasis-annotated dialogue samples capturing the implications of emphasis. We evaluate various LLMs, both open-source and commercial, to measure their performance in understanding emphasis. Additionally, we propose an automatic evaluation pipeline using GPT-4, which achieves a high correlation with human rating. Our findings reveal that although commercial LLMs generally perform better, there is still significant room for improvement in comprehending emphasized sentences.
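
The abstract states that the GPT-4 based evaluation correlates highly with human ratings, but this page does not say which correlation measure was used. As a minimal sketch, assuming a Spearman rank correlation and using made-up scores, agreement between an automatic scorer and human raters could be checked as follows. The function names and all numbers are hypothetical and are not taken from the paper or its repository.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical per-response scores (illustrative only, not from the paper).
auto_scores = [4, 2, 5, 3, 1, 4]
human_scores = [5, 2, 4, 3, 1, 4]
rho = spearman(auto_scores, human_scores)
print(f"Spearman correlation: {rho:.3f}")
```

A rank-based measure is a common choice for ordinal ratings because it depends only on the ordering of scores, not on the spacing between scale points.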

Code & Models

Repositories

DanielLin94144/Emphasized-Talk (Official)

Videos

Can LLMs Understand the Implication of Emphasized Sentences in Dialogue?· underline

Taxonomy

Topics: Interpreting and Communication in Healthcare

Methods: Residual Connection · Softmax · Layer Normalization · Byte Pair Encoding · Label Smoothing · Adam · Attention Is All You Need · Linear Layer · Multi-Head Attention · Position-Wise Feed-Forward Layer