Within-Model vs Between-Prompt Variability in Large Language Models for Creative Tasks

Jennifer Haase; Jana Gonnermann-Müller; Paul H. P. Hanel; Nicolas Leins; Thomas Kosch; Jan Mendling; Sebastian Pokutta

arXiv:2601.21339·cs.AI·January 30, 2026

TL;DR

This study quantifies how prompts, model choice, and sampling randomness each contribute to variance in large language model outputs on creative tasks. Prompts and model choice explain comparable shares of output quality (originality), while model choice and within-model stochasticity dominate output quantity (fluency).

Contribution

The paper decomposes LLM output variance on creative tasks into three sources (prompts, model choice, and within-model sampling noise) and reports the relative share of each for output quality and output quantity.

Findings

01 Prompts explain 36.43% of the variance in output quality (originality).

02 Model choice explains 40.94% of the variance in output quality (originality).

03 Within-LLM stochasticity accounts for 33.70% of the variance in output quantity (fluency).

Abstract

How much of LLM output variance is explained by prompts versus model choice versus stochasticity through sampling? We answer this by evaluating 12 LLMs on 10 creativity prompts with 100 samples each (N = 12,000). For output quality (originality), prompts explain 36.43% of variance, comparable to model choice (40.94%). But for output quantity (fluency), model choice (51.25%) and within-LLM variance (33.70%) dominate, with prompts explaining only 4.22%. Prompts are powerful levers for steering output quality, but given the substantial within-LLM variance (10-34%), single-sample evaluations risk conflating sampling noise with genuine prompt or model effects.
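
To make the idea of splitting variance among prompts, models, and sampling noise concrete, the following is a minimal sketch of a sum-of-squares decomposition for a balanced design with the same shape as the study (12 models, 10 prompts, 100 samples per cell). It is an illustration only: the abstract does not state which estimator the authors used, the function name `variance_shares` is hypothetical, and the scores are synthetic random numbers, not the paper's data.

```python
import random
from statistics import fmean


def variance_shares(scores):
    """Split total variance into model, prompt, interaction, and within-cell shares.

    scores[m][p] is the list of per-sample scores for model m on prompt p.
    Assumes a balanced design (same number of samples in every cell).
    """
    n_models = len(scores)
    n_prompts = len(scores[0])
    n_samples = len(scores[0][0])

    # Means per (model, prompt) cell, per model, per prompt, and overall.
    cell_means = [[fmean(cell) for cell in row] for row in scores]
    model_means = [fmean(row) for row in cell_means]
    prompt_means = [
        fmean(cell_means[m][p] for m in range(n_models)) for p in range(n_prompts)
    ]
    grand = fmean(model_means)

    # Between-model and between-prompt sums of squares.
    ss_model = n_prompts * n_samples * sum((x - grand) ** 2 for x in model_means)
    ss_prompt = n_models * n_samples * sum((x - grand) ** 2 for x in prompt_means)
    # Variation among cell means that the two main effects do not explain.
    ss_cells = n_samples * sum((x - grand) ** 2 for row in cell_means for x in row)
    ss_interaction = ss_cells - ss_model - ss_prompt
    # Within-cell variation: repeated samples from the same model and prompt.
    ss_within = sum(
        (x - cell_means[m][p]) ** 2
        for m in range(n_models)
        for p in range(n_prompts)
        for x in scores[m][p]
    )
    ss_total = sum((x - grand) ** 2 for row in scores for cell in row for x in cell)

    return {
        "model": ss_model / ss_total,
        "prompt": ss_prompt / ss_total,
        "model_x_prompt": ss_interaction / ss_total,
        "within": ss_within / ss_total,
    }


# Synthetic stand-in data (NOT the paper's data): additive model and prompt
# effects plus per-sample noise, all drawn from a standard normal.
rng = random.Random(0)
N_MODELS, N_PROMPTS, N_SAMPLES = 12, 10, 100
model_effect = [rng.gauss(0, 1) for _ in range(N_MODELS)]
prompt_effect = [rng.gauss(0, 1) for _ in range(N_PROMPTS)]
scores = [
    [
        [model_effect[m] + prompt_effect[p] + rng.gauss(0, 1) for _ in range(N_SAMPLES)]
        for p in range(N_PROMPTS)
    ]
    for m in range(N_MODELS)
]

shares = variance_shares(scores)
n_total = N_MODELS * N_PROMPTS * N_SAMPLES  # 12 * 10 * 100 = 12,000 observations
for source, share in shares.items():
    print(f"{source}: {share:.2%}")
```

In a balanced design the four shares sum to one, so each can be read as a percentage of total variance. The within-cell share can only be estimated when each model and prompt pair is sampled more than once, which is why a single-sample evaluation cannot separate sampling noise from prompt or model effects.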

Taxonomy

Topics: Artificial Intelligence in Games · Creativity in Education and Neuroscience · Wikis in Education and Collaboration