CS4: Measuring the Creativity of Large Language Models Automatically by Controlling the Number of Story-Writing Constraints
Anirudh Atmakuru, Jatin Nainani, Rohith Siddhartha Reddy Bheemreddy, Anirudh Lakkaraju, Zonghai Yao, Hamed Zamani, Haw-Shiuan Chang

TL;DR
This paper introduces CS4, a benchmark dataset that measures the creativity of large language models in story writing by controlling prompt constraints, revealing how different models balance creativity, instruction-following, and coherence.
Contribution
The paper presents a novel benchmark dataset, CS4, that assesses LLM creativity through prompt constraint variation, enabling indirect measurement without human annotations.
Findings
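Different LLMs balance creativity, instruction-following, and coherence differently as the number of prompt constraints changes.
Increasing the number of constraints hinders models from retelling narratives in their training data.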
Learning from Human Feedback improves story selection but not creativity.
Abstract
Evaluating the creativity of large language models (LLMs) in story writing is difficult because LLM-generated stories could seemingly look creative but be very similar to some existing stories in their huge and proprietary training corpus. To overcome this challenge, we introduce a novel benchmark dataset with varying levels of prompt specificity: CS4 (Comparing the Skill of Creating Stories by Controlling the Synthesized Constraint Specificity). By increasing the number of requirements/constraints in the prompt, we can increase the prompt specificity and hinder LLMs from retelling high-quality narratives in their training data. Consequently, CS4 empowers us to indirectly measure the LLMs' creativity without human annotations. Our experiments on LLaMA, Gemma, and Mistral not only highlight the…
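The core idea in the abstract, raising prompt specificity by adding more constraints, can be illustrated with a minimal sketch. Everything here is hypothetical and for illustration only: the function name `build_prompt`, the prompt wording, and the example constraints are assumptions, not the paper's actual prompts or data.

```python
from typing import List


def build_prompt(base_instruction: str, constraints: List[str], n: int) -> str:
    """Return a story-writing prompt that includes the first n constraints.

    Hypothetical helper: the wording is an assumption, not taken from CS4.
    """
    if n < 0 or n > len(constraints):
        raise ValueError("n must be between 0 and len(constraints)")
    if n == 0:
        # No constraints: the least specific prompt.
        return base_instruction
    numbered = "\n".join(
        f"{i}. {c}" for i, c in enumerate(constraints[:n], start=1)
    )
    return f"{base_instruction}\nThe story must satisfy these constraints:\n{numbered}"


# Made-up constraints, used only to show how specificity grows with n.
base = "Write a short story about a lighthouse keeper."
constraints = [
    "The keeper never speaks aloud.",
    "A storm arrives in the second paragraph.",
    "The story ends with a letter.",
]

# One prompt per specificity level.
prompts = {n: build_prompt(base, constraints, n) for n in (0, 1, 3)}
```

Comparing the stories a model writes at each level of `n` is the kind of controlled comparison the abstract describes; how the outputs are scored is not covered by this sketch.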
Taxonomy
Topics: Topic Modeling · Advanced Text Analysis Techniques · Artificial Intelligence in Games
Methods: LLaMA
