Evaluating Large language models on Understanding Korean indirect Speech acts

Youngeun Koo, Jiwoo Lee, Dojun Park, Seohyun Park, and Sungeun Lee

arXiv:2502.10995·cs.CL·February 18, 2025

Open Access

TL;DR

This paper evaluates how well large language models understand indirect speech acts in Korean. Claude3-Opus scored highest among the tested models, but no model matched human performance, which points to the need for further research.

Contribution

The study evaluates LLMs' understanding of indirect speech acts in Korean, documenting the gap between model and human performance and identifying Claude3-Opus as the strongest of the tested models.

Findings

01

Claude3-Opus outperformed other models with 71.94% MCQ accuracy.

02

Proprietary models generally performed better than open-source models.

03

No LLM matched human performance in understanding indirect speech acts.

Abstract

Accurately understanding the intention of an utterance is crucial in conversational communication. As conversational artificial intelligence models are rapidly being developed and applied in various fields, it is important to evaluate LLMs' ability to understand the intention behind a user's utterance. This study evaluates whether current LLMs can understand the intention of an utterance by considering the given conversational context, particularly in cases where the actual intention differs from the surface-level, literal meaning of the sentence, i.e., indirect speech acts. Our findings reveal that Claude3-Opus outperformed the other competing models, with 71.94% on multiple-choice questions (MCQ) and 65% on open-ended questions (OEQ), showing a clear advantage. In general, proprietary models performed better than open-source models. Nevertheless, no LLM reached the level of human performance.…

Taxonomy

Topics: Speech Recognition and Synthesis