Bridging the Creativity Understanding Gap: Small-Scale Human Alignment Enables Expert-Level Humor Ranking in LLMs

Kuan Lok Zhou; Jiayi Chen; Siddharth Suresh; Reuben Narad; Timothy T. Rogers; Lalit K Jain; Robert D Nowak; Bob Mankoff; Jifan Zhang

arXiv:2502.20356·cs.CL·February 28, 2025

TL;DR

This paper improves LLMs' humor understanding by decomposing the task, enhancing visual and reasoning components, and aligning with human preferences, achieving expert-level caption ranking accuracy.

Contribution

It introduces a systematic approach to enhance humor comprehension in LLMs through component-wise improvements and targeted alignment with human preferences.

Findings

01

Achieved 82.4% accuracy in caption ranking, improving on the previous 67% benchmark.

02

Model finetuning with crowd preferences significantly improved performance.

03

Persona prompts had minimal impact on mimicking subgroup preferences.

Abstract

Large Language Models (LLMs) have shown significant limitations in understanding creative content, as demonstrated by Hessel et al. (2023)'s influential work on the New Yorker Cartoon Caption Contest (NYCCC). Their study exposed a substantial gap between LLMs and humans in humor comprehension, establishing that understanding and evaluating creative content is a key challenge in AI development. We revisit this challenge by decomposing humor understanding into three components and systematically improve each: enhancing visual understanding through improved annotation, utilizing LLM-generated humor reasoning and explanations, and implementing targeted alignment with human preference data. Our refined approach achieves 82.4% accuracy in caption ranking, significantly improving upon the previous 67% benchmark and matching the performance of world-renowned human experts in this domain. Notably,…
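
To make the caption ranking accuracy figure concrete, here is a minimal sketch of how such a metric could be computed. It assumes a pairwise setup in which each example holds two captions and a label for the one humans preferred; the exact evaluation protocol is not spelled out in this listing, so the data layout, the `pairwise_ranking_accuracy` function, and the `toy_score` scorer are all hypothetical illustrations, not the authors' code.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical layout: (caption_a, caption_b, index of the human-preferred caption).
Pair = Tuple[str, str, int]


def pairwise_ranking_accuracy(
    pairs: Sequence[Pair], score: Callable[[str], float]
) -> float:
    """Fraction of pairs where the scorer ranks the human-preferred caption higher."""
    if not pairs:
        raise ValueError("need at least one caption pair")
    correct = 0
    for caption_a, caption_b, preferred in pairs:
        # Predict index 0 when caption_a scores at least as high as caption_b.
        predicted = 0 if score(caption_a) >= score(caption_b) else 1
        correct += int(predicted == preferred)
    return correct / len(pairs)


def toy_score(caption: str) -> float:
    """Stand-in scorer (an assumption): shorter captions receive higher scores."""
    return -float(len(caption))


toy_pairs = [
    ("Short one.", "A much longer and rambling caption.", 0),
    ("This caption goes on for quite a while.", "Brief.", 1),
    ("Tiny.", "An extended caption that humans preferred.", 1),
]

accuracy = pairwise_ranking_accuracy(toy_pairs, toy_score)
print(f"pairwise ranking accuracy: {accuracy:.3f}")
```

In practice `toy_score` would be replaced by a model-derived funniness score; the toy scorer is only there so the sketch runs on its own.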

Taxonomy

Topics: Humor Studies and Applications · Multimodal Machine Learning Applications · Language, Metaphor, and Cognition