Bridging the Creativity Understanding Gap: Small-Scale Human Alignment Enables Expert-Level Humor Ranking in LLMs
Kuan Lok Zhou, Jiayi Chen, Siddharth Suresh, Reuben Narad, Timothy T. Rogers, Lalit K Jain, Robert D Nowak, Bob Mankoff, Jifan Zhang

TL;DR
This paper improves LLMs' humor understanding by decomposing the task, enhancing visual and reasoning components, and aligning with human preferences, achieving expert-level caption ranking accuracy.
Contribution
It introduces a systematic approach to enhance humor comprehension in LLMs through component-wise improvements and targeted alignment with human preferences.
Findings
Achieved 82.4% accuracy in caption ranking, up from the previous 67% benchmark.
Model finetuning with crowd preferences significantly improved performance.
Persona prompts had minimal impact on the model's ability to mimic subgroup preferences.
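The accuracy figure above can be read as pairwise ranking accuracy. The following is a minimal sketch of that metric, assuming the task is framed as picking the caption humans preferred out of two candidates for the same cartoon. The names `Pair`, `pairwise_accuracy`, and `shorter_caption_baseline` are hypothetical and not taken from the paper, and the baseline is only a stand-in for a real model call.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical record layout (an assumption, not the paper's data format):
# (cartoon_description, caption_a, caption_b, human_preferred),
# where human_preferred is "a" or "b".
Pair = Tuple[str, str, str, str]


def pairwise_accuracy(
    pairs: Sequence[Pair],
    choose: Callable[[str, str, str], str],
) -> float:
    """Fraction of pairs where `choose` agrees with the human preference."""
    if not pairs:
        raise ValueError("need at least one caption pair")
    correct = sum(
        1 for description, a, b, gold in pairs if choose(description, a, b) == gold
    )
    return correct / len(pairs)


def shorter_caption_baseline(description: str, a: str, b: str) -> str:
    """Toy stand-in for a model: prefer the shorter caption (ties go to "a")."""
    return "a" if len(a) <= len(b) else "b"
```

In practice `choose` would wrap an LLM call that receives the cartoon description and both captions; the sketch only fixes the interface so that different rankers can be scored the same way.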
Abstract
Large Language Models (LLMs) have shown significant limitations in understanding creative content, as demonstrated by the influential work of Hessel et al. (2023) on the New Yorker Cartoon Caption Contest (NYCCC). Their study exposed a substantial gap between LLMs and humans in humor comprehension, establishing that understanding and evaluating creative content is a key challenge in AI development. We revisit this challenge by decomposing humor understanding into three components and systematically improving each: enhancing visual understanding through improved annotation, utilizing LLM-generated humor reasoning and explanations, and implementing targeted alignment with human preference data. Our refined approach achieves 82.4% accuracy in caption ranking, significantly improving upon the previous 67% benchmark and matching the performance of world-renowned human experts in this domain. Notably,…
Taxonomy
Topics: Humor Studies and Applications · Multimodal Machine Learning Applications · Language, Metaphor, and Cognition
