Help Me Write a Story: Evaluating LLMs' Ability to Generate Writing Feedback

Hannah Rashkin; Elizabeth Clark; Fantine Huot; Mirella Lapata

arXiv:2507.16007·cs.CL·July 23, 2025

TL;DR

This paper assesses how well large language models can provide meaningful writing feedback to creative writers, highlighting their strengths and limitations through a new dataset and evaluation framework.

Contribution

The paper introduces a novel dataset, task definition, and evaluation methods for assessing LLMs' ability to give writing feedback, using stories corrupted to contain intentional writing issues.

Findings

01. Models provide mostly accurate and specific feedback.

02. Models often miss the main writing issues.

03. Models struggle to decide when to give critical versus positive feedback.

Abstract

Can LLMs provide support to creative writers by giving meaningful writing feedback? In this paper, we explore the challenges and limitations of model-generated writing feedback by defining a new task, dataset, and evaluation frameworks. To study model performance in a controlled manner, we present a novel test set of 1,300 stories that we corrupted to intentionally introduce writing issues. We study the performance of commonly used LLMs in this task with both automatic and human evaluation metrics. Our analysis shows that current models have strong out-of-the-box behavior in many respects -- providing specific and mostly accurate writing feedback. However, models often fail to identify the biggest writing issue in the story and to correctly decide when to offer critical vs. positive feedback.

Taxonomy

Topics: Higher Education Learning Practices