HumorGen: Cognitive Synergy for Humor Generation in Large Language Models via Persona-Based Distillation

Edward Ajayi; Prasenjit Mitra

arXiv:2604.09629·cs.CL·April 14, 2026

TL;DR

HumorGen introduces the Cognitive Synergy Framework, which combines persona-based distillation with a Mixture-of-Thought (MoT) approach to improve humor generation in large language models. The fine-tuned 7B model outperforms larger instruction-tuned baselines, suggesting that data quality matters more than scale.

Contribution

This work presents a cognitively inspired methodology for synthesizing humor data and fine-tuning LLMs on it, demonstrating improved humor generation with a 7B model.

Findings

01 7B model outperforms larger instruction-tuned baselines

02 Cognitive-driven data curation is more critical than model size

03 The framework achieves competitive performance with state-of-the-art models

Abstract

Humor generation poses a significant challenge for Large Language Models (LLMs) because their standard training objective, predicting the most likely next word, inherently conflicts with the surprise and incongruity that comedy requires. To bridge this gap, we introduce the Cognitive Synergy Framework, a theoretically grounded methodology for generating high-quality humor data inspired by psychological theories of humor. Using a Mixture-of-Thought (MoT) approach, we deploy six cognitive personas (e.g., The Absurdist, The Cynic) to synthesize diverse comedic perspectives for a given prompt. We use the resulting dataset to fine-tune a 7B-parameter student model. We compare Direct Preference Optimization (DPO) and a novel Offline Group Relative Policy Optimization (O-GRPO); our 7B model significantly outperforms larger instruction-tuned…
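The two training objectives the abstract compares can be illustrated with a minimal sketch. The DPO function below follows the standard published DPO loss for a single preference pair. The abstract does not define O-GRPO, so the second function is only the generic group-relative advantage normalization used in GRPO-style methods, not the paper's offline variant. The function names, the `beta` default, and the example rewards are all assumptions made for illustration.

```python
import math
from statistics import mean, pstdev


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    Inputs are summed log-probabilities of each response under the
    policy being trained and under a frozen reference model.
    beta=0.1 is an assumed value, not taken from the paper.
    """
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    x = beta * margin
    # -log(sigmoid(x)) == softplus(-x), computed in a numerically stable way.
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))


def group_relative_advantages(rewards, eps=1e-8):
    """Generic GRPO-style advantages: z-score each reward within its group.

    A "group" here is the set of candidate responses to one prompt, for
    example one joke per persona. This is an assumed stand-in for O-GRPO.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # Hypothetical log-probabilities for a preferred and a dispreferred joke.
    print(dpo_loss(-12.0, -15.0, -13.0, -14.0))
    # Hypothetical humor scores for six persona-generated candidates.
    print(group_relative_advantages([0.2, 0.9, 0.4, 0.7, 0.1, 0.5]))
```

In this sketch DPO learns from one preferred and one dispreferred response at a time, while the group-relative form ranks every candidate for a prompt against the group mean, which is a natural fit when several personas each contribute a response to the same prompt.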
