Crowd Score: A Method for the Evaluation of Jokes using Large Language Model AI Voters as Judges
Fabricio Goes, Zisen Zhou, Piotr Sawicki, Marek Grzes, Daniel G. Brown

TL;DR
The paper introduces Crowd Score, a method using large language models as AI judges to evaluate joke funniness, demonstrating alignment with human judgments and potential applications in creative domains.
Contribution
It presents a novel AI-based scoring method for jokes using personality-induced LLMs, validated through explanation auditing and tested on diverse humor types.
Findings
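Few-shot prompting gives better results than zero-shot prompting on the voting question.
Personality induction affects joke ratings: aggressive and self-defeating voters find significantly more of the aggressive/self-defeating jokes funny than affiliative and self-enhancing voters do.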
Crowd Score aligns with human judgments on joke funniness.
Abstract
This paper presents the Crowd Score, a novel method to assess the funniness of jokes using large language models (LLMs) as AI judges. Our method relies on inducing different personalities into the LLM and aggregating the votes of the AI judges into a single score to rate jokes. We validate the votes using an auditing technique that uses the LLM to check whether the explanation for a particular vote is reasonable. We tested our methodology on 52 jokes in a crowd of four AI voters with different humour types: affiliative, self-enhancing, aggressive and self-defeating. Our results show that few-shot prompting leads to better results than zero-shot for the voting question. Personality induction showed that aggressive and self-defeating voters are significantly more inclined than affiliative and self-enhancing voters to find jokes from a set of aggressive/self-defeating jokes funny. The…
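The voting-and-aggregation idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the prompt wording or the aggregation formula, so everything here is an assumption made for illustration: the prompt text in `build_prompt`, the `voter` callable standing in for an LLM call, and the choice to score a joke as the fraction of voters who find it funny. The explanation-auditing step is not modelled.

```python
from typing import Callable, Dict

# The four humour types named in the abstract.
HUMOUR_TYPES = ("affiliative", "self-enhancing", "aggressive", "self-defeating")


def build_prompt(humour_type: str, joke: str) -> str:
    """Hypothetical personality-induction prompt; the paper's wording is not shown here."""
    return (
        f"You are a person with a {humour_type} sense of humour. "
        f"Is the following joke funny? Answer yes or no.\nJoke: {joke}"
    )


def collect_votes(joke: str, voter: Callable[[str], bool]) -> Dict[str, bool]:
    """Ask one induced personality per humour type for a funny / not-funny vote.

    `voter` is a stand-in for an LLM call: it takes a prompt and returns True
    if the model says the joke is funny.
    """
    return {h: voter(build_prompt(h, joke)) for h in HUMOUR_TYPES}


def crowd_score(votes: Dict[str, bool]) -> float:
    """Assumed aggregation: the fraction of voters who found the joke funny."""
    if not votes:
        raise ValueError("at least one vote is required")
    return sum(votes.values()) / len(votes)
```

With a real model, `voter` would wrap an API call and parse the yes/no answer. A fixed function is enough to exercise the aggregation, for example `crowd_score(collect_votes(joke, lambda prompt: True))` evaluates to `1.0`.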
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Computational and Text Analysis Methods
