Not All Jokes Land: Evaluating Large Language Models Understanding of Workplace Humor

Mohammadamin Shafiei; Hamidreza Saffari

arXiv:2506.01819 · cs.CL · June 9, 2025

Open Access

TL;DR

This paper builds a dataset of professional workplace humor and uses it to evaluate five large language models, finding that the models often fail to judge accurately whether a humorous statement is appropriate.

Contribution

The study introduces a new dataset of professional workplace humor statements, each paired with features that determine its appropriateness, and uses it to assess how well LLMs judge humor appropriateness.

Findings

01. LLMs often misjudge humor appropriateness in workplace contexts.

02. Current LLMs struggle with understanding professional humor.

03. The dataset enables better evaluation of humor understanding in AI.

Abstract

With the recent advances in Artificial Intelligence (AI) and Large Language Models (LLMs), the automation of daily tasks, like automatic writing, is getting more and more attention. Hence, efforts have focused on aligning LLMs with human values, yet humor, particularly professional industrial humor used in workplaces, has been largely neglected. To address this, we develop a dataset of professional humor statements along with features that determine the appropriateness of each statement. Our evaluation of five LLMs shows that LLMs often struggle to judge the appropriateness of humor accurately.

Taxonomy

Topics: Humor Studies and Applications · Language, Metaphor, and Cognition · Language, Communication, and Linguistic Studies