Do Large Language Models Understand Conversational Implicature -- A case study with a Chinese sitcom

Shisen Yue; Siyuan Song; Xinyuan Cheng; Hai Hu

arXiv:2404.19509·cs.CL·August 1, 2024

TL;DR

This study evaluates how well large language models understand conversational implicature in Chinese dialogues, introducing a new dataset and testing multiple models' abilities to interpret non-literal meanings.

Contribution

We created SwordsmanImp, the first Chinese multi-turn-dialogue dataset on conversational implicature, and systematically evaluated various LLMs' understanding and explanation capabilities.

Findings

01. GPT-4 achieves 94% accuracy on multiple-choice implicature questions.

02. Most models generate fluent explanations but lack reasonable, logical justifications.

03. LLMs' performance does not significantly differ across different Gricean maxims.

Abstract

Understanding the non-literal meaning of an utterance is critical for large language models (LLMs) to become human-like social communicators. In this work, we introduce SwordsmanImp, the first Chinese multi-turn-dialogue-based dataset aimed at conversational implicature, sourced from dialogues in the Chinese sitcom My Own Swordsman. It includes 200 carefully handcrafted questions, each annotated with the Gricean maxims that are violated. We test eight closed-source and open-source LLMs on two tasks: a multiple-choice question task and an implicature explanation task. Our results show that GPT-4 attains human-level accuracy (94%) on multiple-choice questions. CausalLM follows with 78.5% accuracy. Other models, including GPT-3.5 and several open-source models, reach lower accuracies, ranging from 20% to 60%, on multiple-choice questions. Human raters…
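The multiple-choice scores above are plain accuracies, and the per-maxim comparison groups the same scores by annotation. The following is a minimal sketch of that bookkeeping, not the authors' evaluation code: the `Item` schema, the field names, and the toy records are all hypothetical, and counting an item once under every maxim it is annotated with is an assumption. If all 200 questions are scored, 94% corresponds to 188 correct answers and 78.5% to 157.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Item:
    """One multiple-choice question (hypothetical schema, not the dataset's real format)."""
    maxims: tuple[str, ...]  # Gricean maxims annotated as violated
    gold: str                # label of the correct option
    predicted: str           # option the model picked


def accuracy(items: list[Item]) -> float:
    """Fraction of items where the model's pick matches the gold option."""
    if not items:
        raise ValueError("no items to score")
    return sum(i.predicted == i.gold for i in items) / len(items)


def accuracy_by_maxim(items: list[Item]) -> dict[str, float]:
    """Accuracy per maxim; an item with several maxims counts toward each (assumption)."""
    groups: dict[str, list[Item]] = defaultdict(list)
    for item in items:
        for maxim in item.maxims:
            groups[maxim].append(item)
    return {maxim: accuracy(group) for maxim, group in groups.items()}


# Toy records for illustration only; these are not items from the dataset.
toy = [
    Item(("quantity",), "A", "A"),
    Item(("relevance",), "B", "C"),
    Item(("quality", "relevance"), "D", "D"),
    Item(("manner",), "A", "A"),
]

print(accuracy(toy))           # 0.75
print(accuracy_by_maxim(toy))  # per-maxim breakdown of the same predictions
```

Grouping by maxim reuses the same `accuracy` function, so the overall and per-maxim numbers stay directly comparable.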

Code & Models

Repositories

sjtu-compling/llm-pragmatics (PyTorch, official)

Taxonomy

Topics: Computational and Text Analysis Methods · Subtitles and Audiovisual Media

Methods: Attention Is All You Need · Position-Wise Feed-Forward Layer · Absolute Position Encodings · Linear Layer · Label Smoothing · Adam · Layer Normalization · Attention Dropout