Oogiri-Master: Benchmarking Humor Understanding via Oogiri
Soichiro Murakami, Hidetaka Kamigaito, Hiroya Takamura, Manabu Okumura

TL;DR
This paper introduces Oogiri-Master and Oogiri-Corpus, a benchmark and a dataset for evaluating humor understanding in large language models (LLMs). Both are built on responses from the Japanese creative response game Oogiri and come with extensive human ratings and linguistic analysis.
Contribution
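It provides a humor dataset and benchmark in which each prompt is paired with approximately 100 candidate responses, each rated independently by approximately 100 human judges, along with objective metrics and analysis that enable rigorous evaluation of LLMs' humor comprehension.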
Findings
State-of-the-art models approach human performance in humor understanding.
Insight-augmented prompting improves model performance.
Linguistic factors like ambiguity and incongruity influence funniness.
Abstract
Humor is a salient testbed for human-like creative thinking in large language models (LLMs). We study humor using the Japanese creative response game Oogiri, in which participants produce witty responses to a given prompt, and ask the following research question: What makes such responses funny to humans? Previous work has offered only limited reliable means to answer this question. Existing datasets contain few candidate responses per prompt, expose popularity signals during ratings, and lack objective and comparable metrics for funniness. Thus, we introduce Oogiri-Master and Oogiri-Corpus, which are a benchmark and dataset designed to enable rigorous evaluation of humor understanding in LLMs. Each prompt is paired with approximately 100 diverse candidate responses, and funniness is rated independently by approximately 100 human judges without access to others' ratings, reducing…
Taxonomy
Topics: Humor Studies and Applications · Language, Metaphor, and Cognition · Multisensory perception and integration
