Metaphor Understanding Challenge Dataset for LLMs
Xiaoyu Tong, Rochelle Choenni, Martha Lewis, Ekaterina Shutova

TL;DR
The paper introduces MUNCH, a challenging dataset with paraphrased metaphorical sentences across genres, to evaluate and improve the metaphor understanding capabilities of large language models like LLaMA and GPT-3.5.
Contribution
It provides a novel, manually annotated dataset with apt and inapt paraphrases for assessing metaphor comprehension in LLMs across diverse genres.
Findings
LLMs find MUNCH a challenging benchmark
Models perform better on literal than metaphorical sentences
The dataset reveals gaps in current metaphor understanding capabilities
Abstract
Metaphors in natural language are a reflection of fundamental cognitive processes such as analogical reasoning and categorisation, and are deeply rooted in everyday communication. Metaphor understanding is therefore an essential task for large language models (LLMs). We release the Metaphor Understanding Challenge Dataset (MUNCH), designed to evaluate the metaphor understanding capabilities of LLMs. The dataset provides over 10k paraphrases for sentences containing metaphor use, as well as 1.5k instances containing inapt paraphrases. The inapt paraphrases were carefully selected to serve as control to determine whether the model indeed performs full metaphor interpretation or rather resorts to lexical similarity. All apt and inapt paraphrases were manually annotated. The metaphorical sentences cover natural metaphor uses across 4 genres (academic, news, fiction, and conversation), and…
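The role of the inapt paraphrases as a control can be illustrated with a small sketch. This is not the authors' evaluation code: the single item below is invented for illustration and is not taken from MUNCH, the field names are hypothetical, and character-set overlap is only a crude stand-in for "lexical similarity". The point is that a chooser relying on surface similarity to the metaphor word can prefer the inapt candidate, which is what such a control is meant to expose.

```python
def char_jaccard(a, b):
    """Jaccard overlap between the character sets of two words (crude surface similarity)."""
    sa, sb = set(a.lower()), set(b.lower())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def pick_most_similar(metaphor_word, candidates):
    """Return the index of the candidate that looks most like the metaphor word."""
    return max(range(len(candidates)),
               key=lambda i: char_jaccard(metaphor_word, candidates[i]))


def accuracy(items, chooser):
    """Fraction of items on which the chooser selects the apt paraphrase."""
    hits = sum(
        chooser(item["metaphor_word"], item["candidates"]) == item["apt_index"]
        for item in items
    )
    return hits / len(items)


# Invented example item (assumption, not MUNCH data): "understood" is the apt
# paraphrase of the metaphorical "grasped"; "gripped" is the inapt one.
ITEMS = [
    {
        "sentence": "She grasped the idea at once.",
        "metaphor_word": "grasped",
        "candidates": ["understood", "gripped"],
        "apt_index": 0,
    }
]

if __name__ == "__main__":
    print(accuracy(ITEMS, pick_most_similar))
```

Any other chooser with the same signature, for example one that wraps a language model's judgement, can be passed to `accuracy` in place of `pick_most_similar` for comparison against this surface-similarity baseline.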
Taxonomy
Topics: Semantic Web and Ontologies · Natural Language Processing Techniques
Methods: Attention Is All You Need · Linear Layer · Softmax · Layer Normalization · Multi-Head Attention · Cosine Annealing · Dropout
