Crowd Score: A Method for the Evaluation of Jokes using Large Language Model AI Voters as Judges

Fabricio Goes; Zisen Zhou; Piotr Sawicki; Marek Grzes; Daniel G. Brown

arXiv:2212.11214 · cs.AI · December 22, 2022 · 1 citation

TL;DR

The paper introduces Crowd Score, a method using large language models as AI judges to evaluate joke funniness, demonstrating alignment with human judgments and potential applications in creative domains.

Contribution

It presents a novel AI-based scoring method for jokes using personality-induced LLMs, validated through explanation auditing and tested on diverse humor types.

Findings

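01. Few-shot prompting yields better results than zero-shot prompting for the voting question.

02. Personality induction affects joke ratings: aggressive and self-defeating voters are significantly more inclined than affiliative and self-enhancing voters to find aggressive/self-defeating jokes funny.

03. Crowd Score aligns with human judgments of joke funniness.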

Abstract

This paper presents the Crowd Score, a novel method to assess the funniness of jokes using large language models (LLMs) as AI judges. Our method relies on inducing different personalities into the LLM and aggregating the votes of the AI judges into a single score to rate jokes. We validate the votes using an auditing technique in which the LLM checks whether the explanation for a particular vote is reasonable. We tested our methodology on 52 jokes with a crowd of four AI voters, each with a different humour type: affiliative, self-enhancing, aggressive and self-defeating. Our results show that few-shot prompting leads to better results than zero-shot prompting for the voting question. Personality induction showed that, on a set of aggressive/self-defeating jokes, aggressive and self-defeating voters are significantly more inclined to find jokes funny than affiliative and self-enhancing voters. The…
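
The vote-and-aggregate idea described in the abstract can be illustrated with a minimal sketch. Everything here beyond the four humour types is an assumption made for illustration: the prompt wording in `build_prompt`, the aggregation rule in `crowd_score` (a simple count of "funny" votes), and the `stub_vote` function, which stands in for a real LLM call and is not the authors' implementation.

```python
from typing import Callable, Sequence

# The four humour types named in the abstract.
HUMOUR_TYPES = ("affiliative", "self-enhancing", "aggressive", "self-defeating")

# Hypothetical voter signature: (personality, joke) -> True if the voter finds it funny.
Voter = Callable[[str, str], bool]


def build_prompt(personality: str, joke: str) -> str:
    """Hypothetical personality-induction prompt; the paper's wording may differ."""
    return (
        f"You are a person who enjoys {personality} humour.\n"
        f"Joke: {joke}\n"
        "Is this joke funny? Answer yes or no, then explain your answer."
    )


def crowd_score(joke: str, vote: Voter,
                personalities: Sequence[str] = HUMOUR_TYPES) -> int:
    """Assumed aggregation rule: count the voters who find the joke funny."""
    return sum(1 for p in personalities if vote(p, joke))


def stub_vote(personality: str, joke: str) -> bool:
    """Toy stand-in for an LLM call, so the sketch runs offline.

    Aggressive and self-defeating voters like everything; the others
    only like jokes that mention a pun.
    """
    return personality in ("aggressive", "self-defeating") or "pun" in joke.lower()


if __name__ == "__main__":
    for joke in ("A pun about cheese", "My life is a joke"):
        print(joke, "->", crowd_score(joke, stub_vote))
```

In a real setup, `stub_vote` would be replaced by a function that sends `build_prompt(personality, joke)` to an LLM and parses the yes/no answer; the returned explanation could then be passed back to the model for the auditing step the abstract mentions.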

Code & Models

Repositories

creapar/crowdscore (official)

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Computational and Text Analysis Methods