Benchmarking Language Model Creativity: A Case Study on Code Generation
Yining Lu, Dixuan Wang, Tianjian Li, Dongwei Jiang, Sanjeev Khudanpur, Meng Jiang, Daniel Khashabi

TL;DR
This paper introduces a framework to quantify creativity in large language models through a new prompting technique and a metric, revealing that even advanced models like GPT-4 lack human-like creativity in code generation.
Contribution
The paper presents DENIAL PROMPTING and NEOGAUGE, novel methods for enhancing and measuring creativity in LLMs, applied to coding tasks with a new dataset for future benchmarking.
Findings
GPT-4 still lags behind human creativity in coding tasks.
Advanced reasoning strategies do not significantly boost LLM creativity.
The NEOCODER dataset enables future reproducibility and benchmarking.
Abstract
As LLMs become increasingly prevalent, it is interesting to consider how "creative" these models can be. From cognitive science, creativity consists of at least two key characteristics: convergent thinking (purposefulness to achieve a given goal) and divergent thinking (adaptability to explore new environments or constraints) (Runco, 2003). In this work, we introduce a framework for quantifying LLM creativity that incorporates the two design ingredients: (1) We introduce DENIAL PROMPTING, which pushes LLMs to develop more creative solutions to a given problem by incrementally imposing new constraints on the previous solution, compelling LLMs to adopt new strategies. (2) We define NEOGAUGE, a metric that quantifies both convergent and divergent thinking in the generated creative responses by LLMs. We test the proposed framework on Codeforces problems, which…
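The DENIAL PROMPTING loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `denial_prompting`, `generate`, and `detect_techniques` are hypothetical, the constraint wording is invented for illustration, and banning every newly detected technique in each round is an assumption. The toy "model" below is a stub so the example runs without any LLM.

```python
from typing import Callable, List, Set, Tuple


def denial_prompting(
    problem: str,
    generate: Callable[[str], str],                # hypothetical: prompt -> solution text
    detect_techniques: Callable[[str], Set[str]],  # hypothetical: solution -> techniques used
    rounds: int = 3,
) -> List[Tuple[Set[str], str]]:
    """Ask for a solution repeatedly, banning techniques seen in earlier solutions.

    Returns a list of (constraints in force, solution) pairs, one per round.
    """
    banned: Set[str] = set()
    history: List[Tuple[Set[str], str]] = []
    for _ in range(rounds):
        prompt = problem
        if banned:
            # Assumed constraint phrasing; the source does not specify it.
            prompt += "\nConstraints: do not use " + ", ".join(sorted(banned)) + "."
        solution = generate(prompt)
        history.append((set(banned), solution))
        new_techniques = detect_techniques(solution) - banned
        if not new_techniques:
            break  # nothing new to deny, so stop early
        banned |= new_techniques
    return history


# Toy stand-ins for an LLM and a technique detector (assumptions for the demo).
def toy_generate(prompt: str) -> str:
    if "for loop" not in prompt:
        return "for i in range(n): total += i"
    if "while loop" not in prompt:
        return "while i < n: total += i; i += 1"
    return "total = n * (n - 1) // 2"


def toy_detect(solution: str) -> Set[str]:
    found: Set[str] = set()
    if solution.startswith("for "):
        found.add("for loop")
    if solution.startswith("while "):
        found.add("while loop")
    return found


if __name__ == "__main__":
    for constraints, solution in denial_prompting(
        "Sum the integers below n.", toy_generate, toy_detect
    ):
        print(sorted(constraints), "->", solution)
```

In this sketch each round's constraints accumulate, so a later solution has to avoid every technique used before it. A metric such as NEOGAUGE would then score the resulting solutions; its definition is not given in this summary, so it is not sketched here.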
Taxonomy
Topics: Model-Driven Software Engineering Techniques · Software Engineering Research · Natural Language Processing Techniques
Methods: Attention Is All You Need · Residual Connection · Byte Pair Encoding · Layer Normalization · Label Smoothing · Linear Layer · Adam · Dropout · Multi-Head Attention · Dense Connections
