Pron vs Prompt: Can Large Language Models already Challenge a World-Class Fiction Author at Creative Text Writing?

Guillermo Marco; Julio Gonzalo; Ramón del Castillo; María Teresa Mateo Girona

arXiv:2407.01119 · cs.CL · June 5, 2025 · 3 cites

Open Access

TL;DR

This study pits GPT-4 against Patricio Pron, an award-winning novelist, in a short-story writing contest and finds that LLMs still fall short of top human creative writers.

Contribution

The paper presents a novel contest between a top novelist and GPT-4, providing empirical evidence that LLMs are not yet capable of surpassing top human creative writers.

Findings

01. GPT-4 does not outperform Patricio Pron in creative writing tasks.

02. Larger language models alone are insufficient to reach top human creative levels.

03. Manual evaluations show a significant gap between LLMs and top human authors.

Abstract

It has become routine to report research results where Large Language Models (LLMs) outperform average humans in a wide range of language-related tasks, and creative text writing is no exception. It seems natural, then, to raise the bid: Are LLMs ready to compete in creative writing skills with a top (rather than average) novelist? To provide an initial answer to this question, we have carried out a contest between Patricio Pron (an award-winning novelist, considered one of the best of his generation) and GPT-4 (one of the top-performing LLMs), in the spirit of AI-human duels such as Deep Blue vs. Kasparov and AlphaGo vs. Lee Sedol. We asked Pron and GPT-4 to provide thirty titles each, and then to write short stories for both their own titles and their opponent's. Then, we prepared an evaluation rubric inspired by Boden's definition of creativity, and we collected 5,400 manual assessments provided…

Taxonomy

Topics: Digital Humanities and Scholarship · Topic Modeling · Natural Language Processing Techniques

Methods: Attention Is All You Need · Linear Layer · Multi-Head Attention · Softmax · Byte Pair Encoding · Layer Normalization · Label Smoothing · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Adam