CreativityPrism: A Holistic Evaluation Framework for Large Language Model Creativity
Zhaoyi Joey Hou, Bowei Alvin Zhang, Yining Lu, Bhiman Kumar Baghel, Anneliese Brei, Ximing Lu, Meng Jiang, Faeze Brahman, Snigdha Chaturvedi, Haw-Shiuan Chang, Daniel Khashabi, Xiang Lorraine Li

TL;DR
CreativityPrism is a comprehensive, scalable framework for evaluating large language model creativity across multiple domains and dimensions, addressing limitations of existing fragmented and human-dependent methods.
Contribution
It introduces a holistic evaluation framework that consolidates diverse creative tasks and employs reliable automatic judges validated against human annotations.
Findings
Proprietary LLMs outperform open-source models in creative writing and reasoning.
No significant advantage of proprietary models in divergent thinking.
High performance in one creative dimension rarely correlates with others.
Abstract
Creativity is often seen as a hallmark of human intelligence. While large language models (LLMs) are increasingly perceived as generating creative text, there is still no holistic and scalable framework to evaluate their creativity across diverse scenarios. Existing methods of LLM creativity evaluation either heavily rely on humans, limiting speed and scalability, or are fragmented across different domains and different definitions of creativity. To address this gap, we propose CREATIVITYPRISM, an evaluation analysis framework that consolidates eight tasks from three domains, divergent thinking, creative writing, and logical reasoning, into a taxonomy of creativity that emphasizes three dimensions: quality, novelty, and diversity of LLM generations. The framework is designed to be scalable with reliable automatic evaluation judges that have been validated against human annotations. We…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsCreativity in Education and Neuroscience · Artificial Intelligence in Games · Machine Learning in Materials Science
