A Hard Nut to Crack: Idiom Detection with Conversational Large Language Models

Francesca De Luca Fornaciari; Begoña Altuna; Itziar Gonzalez-Dios; Maite Melero

arXiv:2405.10579·cs.CL·May 20, 2024

Open Access

TL;DR

This paper introduces IdioTS, a challenging dataset for evaluating Large Language Models' ability to detect idiomatic expressions in sentences, along with a comprehensive evaluation methodology and detailed analysis.

Contribution

It presents a new dataset and evaluation framework specifically designed to assess LLMs' idiom detection capabilities at the sentence level.

Findings

01. LLMs show varying performance on idiom detection.

02. Error analysis reveals common challenges in figurative language processing.

03. IdioTS provides a benchmark for future research on idiomatic language processing.

Abstract

In this work, we explore idiomatic language processing with Large Language Models (LLMs). We introduce the Idiomatic language Test Suite (IdioTS), a new dataset of difficult examples specifically designed by language experts to assess the capabilities of LLMs to process figurative language at the sentence level. We propose a comprehensive evaluation methodology based on an idiom detection task, where LLMs are prompted to detect an idiomatic expression in a given English sentence. We present a thorough automatic and manual evaluation of the results and an extensive error analysis.
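To make the idiom detection setup concrete, the following is a minimal sketch of how sentence-level detection could be prompted and scored automatically. It is not the authors' implementation: the prompt wording, the yes/no answer format, the helper names (`build_prompt`, `parse_answer`, `score`), and the choice of precision, recall, and F1 as metrics are all assumptions made for illustration.

```python
def build_prompt(sentence: str) -> str:
    # Hypothetical prompt wording; the paper's actual prompt is not given here.
    return (
        "Does the following English sentence contain an idiomatic expression? "
        "Answer 'yes' or 'no'.\n"
        f"Sentence: {sentence}"
    )


def parse_answer(reply: str) -> bool:
    # Assumption: a reply that starts with "yes" counts as a positive detection.
    return reply.strip().lower().startswith("yes")


def score(gold: list[bool], pred: list[bool]) -> dict[str, float]:
    # Binary precision, recall, and F1 with "contains an idiom" as the positive class.
    tp = sum(1 for g, p in zip(gold, pred) if g and p)
    fp = sum(1 for g, p in zip(gold, pred) if not g and p)
    fn = sum(1 for g, p in zip(gold, pred) if g and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Illustrative labels and model replies only; not data from IdioTS.
    gold = [True, True, False, False]
    replies = ["Yes.", "No.", "Yes, it does.", "no"]
    pred = [parse_answer(r) for r in replies]
    print(score(gold, pred))
```

A free-text reply from a conversational model rarely fits a strict yes/no format, which is one reason an automatic score like this would typically be complemented by manual evaluation, as the abstract describes.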

Taxonomy

Topics: Natural Language Processing Techniques · Translation Studies and Practices · Text Readability and Simplification