CleanComedy: Creating Friendly Humor through Generative Techniques

Dmitry Vikhorev; Daria Galimzianova; Svetlana Gorovaia; Elizaveta Zhemchuzhina; Ivan P. Yamshchikov

arXiv:2412.09203·cs.CL·December 13, 2024

TL;DR

This paper introduces CleanComedy, a new toxicity-filtered humor dataset in English and Russian, and evaluates its effectiveness in improving humor generation models while addressing toxicity issues.

Contribution

The paper presents a novel, partially annotated, toxicity-filtered humor corpus and assesses its impact on generating safer, higher-quality jokes using generative models.

Findings

01 The CleanComedy dataset reduces toxicity in generated jokes.

02 Models trained on CleanComedy produce more humorous and less toxic jokes.

03 A survey confirms that data filtering improves humor quality.

Abstract

Humor generation is a challenging task in natural language processing due to limited resources and the quality of existing datasets. Available humor language resources often suffer from toxicity and duplication, limiting their effectiveness for training robust models. This paper proposes CleanComedy, a specialized, partially annotated toxicity-filtered corpus of English and Russian jokes collected from various sources. We study the effectiveness of our data filtering approach through a survey on humor and toxicity levels in various joke groups. In addition, we study advances in computer humor generation by comparing jokes written by humans with various groups of generative jokes, including our baseline models trained on the CleanComedy datasets.
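
The abstract names two data problems, duplication and toxicity, without giving the filtering details here. As a rough illustration only, the sketch below shows one way such a cleaning pass could look: exact-duplicate removal after text normalization, followed by a threshold on a toxicity score. The function names, the 0.5 threshold, and the keyword-based scorer are all hypothetical stand-ins, not the authors' pipeline; a real setup would plug in a trained toxicity classifier.

```python
import re
from typing import Callable, Iterable, List


def normalize(text: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace.

    Python's \\w is Unicode-aware for str patterns, so this also works
    for Russian text.
    """
    text = re.sub(r"[^\w\s]", "", text.lower())
    return re.sub(r"\s+", " ", text).strip()


def clean_corpus(
    jokes: Iterable[str],
    toxicity_score: Callable[[str], float],
    max_toxicity: float = 0.5,  # hypothetical threshold, not from the paper
) -> List[str]:
    """Keep the first copy of each joke whose toxicity score is low enough."""
    seen = set()
    kept = []
    for joke in jokes:
        key = normalize(joke)
        if not key or key in seen:
            continue  # empty after normalization, or a duplicate
        seen.add(key)
        if toxicity_score(joke) <= max_toxicity:
            kept.append(joke)
    return kept


def keyword_score(text: str) -> float:
    """Toy scorer (assumption): 1.0 if any blocklisted word appears, else 0.0."""
    blocklist = {"idiot", "stupid"}
    return 1.0 if set(normalize(text).split()) & blocklist else 0.0


if __name__ == "__main__":
    sample = [
        "Why did the chicken cross the road?",
        "why did the chicken cross the road",
        "You are an idiot, ha ha.",
    ]
    print(clean_corpus(sample, keyword_score))
```

Passing the scorer in as a function keeps the deduplication logic independent of whichever toxicity model is used, so the same pass could serve both the English and the Russian parts of a corpus.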

Taxonomy

Topics: Humor Studies and Applications · Comics and Graphic Narratives