Can LLMs Generate Good Stories? Insights and Challenges from a Narrative Planning Perspective

Yi Wang; Max Kreminski

arXiv:2506.10161 · cs.CL · June 13, 2025

Open Access

TL;DR

This paper evaluates the story generation capabilities of Large Language Models using narrative planning benchmarks, revealing strengths in causal soundness at small scales and challenges in complex reasoning involving character intent and conflict.

Contribution

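The paper introduces a benchmark, built from literature examples, for evaluating LLMs on narrative planning. It analyzes model performance on causal soundness, character intentionality, and dramatic conflict, giving insight into the abilities and limitations of LLMs for story generation.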

Findings

01. GPT-4 tier LLMs can generate causally sound stories at small scales

02. Planning with character intentionality and dramatic conflict remains challenging

03. Reinforcement learning improves complex reasoning in story generation

Abstract

Story generation has been a prominent application of Large Language Models (LLMs). However, understanding LLMs' ability to produce high-quality stories remains limited due to challenges in automatic evaluation methods and the high cost and subjectivity of manual evaluation. Computational narratology offers valuable insights into what constitutes a good story, which has been applied in the symbolic narrative planning approach to story generation. This work aims to deepen the understanding of LLMs' story generation capabilities by using them to solve narrative planning problems. We present a benchmark for evaluating LLMs on narrative planning based on literature examples, focusing on causal soundness, character intentionality, and dramatic conflict. Our experiments show that GPT-4 tier LLMs can generate causally sound stories at small scales, but planning with character intentionality and…
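As an illustration of what "causal soundness" means in a planning setting, the following is a minimal sketch that assumes a STRIPS-style representation: each story event has preconditions and effects, and a story is sound if every event is executable when it occurs and the goal holds at the end. The `Action` class, the `is_causally_sound` function, and the toy facts are hypothetical names chosen for this example. They are not the paper's benchmark format or evaluation code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """A story event in an assumed STRIPS-style encoding (illustrative only)."""
    name: str
    preconditions: frozenset
    add_effects: frozenset
    delete_effects: frozenset = frozenset()


def is_causally_sound(initial_state, plan, goal):
    """Return True if each action's preconditions hold when it runs
    and every goal fact holds after the last action."""
    state = set(initial_state)
    for action in plan:
        if not action.preconditions <= state:
            return False  # an event happens without its causes being in place
        state -= action.delete_effects
        state |= action.add_effects
    return set(goal) <= state


# Toy story world (hypothetical facts, not taken from the benchmark).
take_sword = Action(
    name="hero takes sword",
    preconditions=frozenset({"hero_in_cave", "sword_in_cave"}),
    add_effects=frozenset({"hero_has_sword"}),
    delete_effects=frozenset({"sword_in_cave"}),
)
slay_dragon = Action(
    name="hero slays dragon",
    preconditions=frozenset({"hero_has_sword", "dragon_alive"}),
    add_effects=frozenset({"dragon_dead"}),
    delete_effects=frozenset({"dragon_alive"}),
)

initial = {"hero_in_cave", "sword_in_cave", "dragon_alive"}
goal = {"dragon_dead"}

sound_story = [take_sword, slay_dragon]
unsound_story = [slay_dragon]  # the hero never obtains the sword

print(is_causally_sound(initial, sound_story, goal))
print(is_causally_sound(initial, unsound_story, goal))
```

This check covers only causal soundness. Character intentionality and dramatic conflict, the other two criteria named in the abstract, would need additional structure (for example, per-character goals) that this sketch does not model.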

Taxonomy

Topics: Artificial Intelligence in Games · Topic Modeling · Multimodal Machine Learning Applications

Methods: Absolute Position Encodings · Layer Normalization · Byte Pair Encoding · Label Smoothing · Softmax · Dropout · Dense Connections · Transformer · GPT-4