Dark & Stormy: Modeling Humor in Sentences from the Bulwer-Lytton Fiction Contest
Venkata S Govindarajan, Laura Biester

TL;DR
This study introduces a new corpus of intentionally bad humor from the Bulwer-Lytton Fiction Contest, analyzing its features and evaluating humor detection models' performance on it.
Contribution
It provides a novel dataset of 'bad' humor, analyzes its literary devices, and assesses the limitations of existing humor detection models.
Findings
Standard humor detection models perform poorly on the corpus.
Contest sentences combine devices common in existing humor datasets, such as puns and irony, with metaphor, metafiction, and simile.
LLMs prompted to write contest-style sentences imitate the form but over-use certain literary devices and produce far more novel adjective-noun bigrams than human writers.
Abstract
Textual humor is enormously diverse, and computational studies need to account for this range, including intentionally bad humor. In this paper, we curate and analyze a novel corpus of sentences from the Bulwer-Lytton Fiction Contest to better understand "bad" humor in English. Standard humor detection models perform poorly on our corpus, and an analysis of literary devices finds that these sentences combine features common in existing humor datasets (e.g., puns, irony) with metaphor, metafiction, and simile. LLMs prompted to synthesize contest-style sentences imitate the form but exaggerate the effect by over-using certain literary devices and including far more novel adjective-noun bigrams than human writers. Data, code and analysis are available at https://github.com/venkatasg/bulwer-lytton
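To make the "novel adjective-noun bigrams" comparison concrete, here is a minimal sketch of one way such a rate could be computed. It is not the authors' implementation: it assumes sentences are already POS-tagged with universal tags, and it assumes "novel" means "not attested in some reference set of bigrams". The function names, the toy reference set, and the two sample sentences are all made up for illustration.

```python
from typing import Iterable, List, Set, Tuple

Tagged = Tuple[str, str]   # (token, universal POS tag); tagging is assumed done upstream
Bigram = Tuple[str, str]   # (adjective, noun), lowercased


def adj_noun_bigrams(sentence: Iterable[Tagged]) -> List[Bigram]:
    """Return lowercased (adjective, noun) pairs for adjacent ADJ NOUN tokens."""
    tokens = list(sentence)
    return [
        (w1.lower(), w2.lower())
        for (w1, t1), (w2, t2) in zip(tokens, tokens[1:])
        if t1 == "ADJ" and t2 == "NOUN"
    ]


def novel_fraction(sentences: Iterable[Iterable[Tagged]], reference: Set[Bigram]) -> float:
    """Fraction of adjective-noun bigrams that do not appear in the reference set."""
    bigrams = [bg for s in sentences for bg in adj_noun_bigrams(s)]
    if not bigrams:
        return 0.0
    novel = sum(1 for bg in bigrams if bg not in reference)
    return novel / len(bigrams)


# Toy data (hypothetical): a tiny "attested" set and two hand-tagged sentences.
reference = {("dark", "night"), ("stormy", "night")}
sample = [
    [("It", "PRON"), ("was", "AUX"), ("a", "DET"), ("dark", "ADJ"), ("night", "NOUN")],
    [("The", "DET"), ("lugubrious", "ADJ"), ("spatula", "NOUN"), ("wept", "VERB")],
]

print(novel_fraction(sample, reference))  # 0.5
```

Running the same function separately over human-written and LLM-generated sentences would give two rates to compare. In practice the reference set would be built from a large corpus, and the tags would come from a POS tagger rather than being written by hand.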
