Comparing Apples to Oranges: A Dataset & Analysis of LLM Humour Understanding from Traditional Puns to Topical Jokes
Tyler Loakman, William Thorne, Chenghua Lin

TL;DR
This paper evaluates the ability of Large Language Models to explain various joke types, revealing their limitations across simple puns and complex topical humour, and introduces a new diverse joke dataset.
Contribution
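It provides a new dataset of 600 jokes across 4 joke types (heterographic puns, homographic puns, contemporary internet humour, and topical jokes), each with a manually written explanation, and analyses LLMs' zero-shot joke explanation capabilities, identifying key research gaps in humour explanation.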
Findings
LLMs struggle to reliably explain all joke types
Existing models perform poorly on complex topical humour
Most prior work focuses only on simple puns
Abstract
Humour, as a complex language form, is derived from myriad aspects of life. Whilst existing work on computational humour has focussed almost exclusively on short pun-based jokes, we investigate whether the ability of Large Language Models (LLMs) to explain humour depends on the particular form. We compare models' joke explanation abilities from simple puns to complex topical humour that requires esoteric knowledge of real-world entities and events. To this end, we curate a dataset of 600 jokes across 4 joke types and manually write high-quality explanations. These jokes include heterographic and homographic puns, contemporary internet humour, and topical jokes. Using this dataset, we compare the zero-shot abilities of a range of LLMs to accurately and comprehensively explain jokes of different types, identifying key research gaps in the task of humour explanation. We find that none of…
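The setup described in the abstract (jokes labelled with one of four types, each paired with a manually written explanation, and models prompted zero-shot) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: the `JokeRecord` schema, the field names, the prompt wording, and the example joke are all hypothetical and are not taken from the dataset.

```python
from dataclasses import dataclass
from enum import Enum


class JokeType(Enum):
    # The four joke types named in the abstract.
    HETEROGRAPHIC_PUN = "heterographic pun"
    HOMOGRAPHIC_PUN = "homographic pun"
    INTERNET_HUMOUR = "contemporary internet humour"
    TOPICAL_JOKE = "topical joke"


@dataclass(frozen=True)
class JokeRecord:
    # Hypothetical schema; the released dataset may use different fields.
    joke: str
    joke_type: JokeType
    reference_explanation: str  # manually written reference explanation


def build_zero_shot_prompt(record: JokeRecord) -> str:
    # Zero-shot: the prompt holds only an instruction and the joke,
    # with no worked examples and no reference explanation.
    # The wording here is an assumption, not the paper's prompt.
    return (
        "Explain why the following joke is funny.\n\n"
        f"Joke: {record.joke}\n"
        "Explanation:"
    )


# Illustrative record only; this joke is not drawn from the dataset.
example = JokeRecord(
    joke="I used to be a banker, but I lost interest.",
    joke_type=JokeType.HOMOGRAPHIC_PUN,
    reference_explanation=(
        "'Interest' means both enthusiasm and the money paid on "
        "deposits or loans, so the banker lost both."
    ),
)
prompt = build_zero_shot_prompt(example)
```

A model's output for `prompt` would then be compared against `reference_explanation`; how that comparison is scored is not specified in the text above.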
Taxonomy
Topics: Culinary Culture and Tourism · Humor Studies and Applications
