StoryAlign: Evaluating and Training Reward Models for Story Generation

Haotian Xia; Hao Peng; Yunjia Qi; Xiaozhi Wang; Bin Xu; Lei Hou; Juanzi Li

arXiv:2605.04831·cs.CL·May 7, 2026

TL;DR

This paper introduces StoryRMB, a benchmark for evaluating reward models on human story preferences, and StoryReward, a reward model trained to select stories that humans prefer.

Contribution

The paper presents the first benchmark for evaluating reward models on story preferences and develops a reward model that outperforms larger models at aligning with human preferences.

Findings

01

Existing reward models struggle to match human story preferences, with the best model achieving only 66.3% accuracy.

02

The authors construct a large dataset of 100,000 story preference pairs across diverse domains.

03

StoryReward outperforms larger models and improves story selection aligned with human preferences.

Abstract

Story generation aims to automatically produce coherent, structured, and engaging narratives. Although large language models (LLMs) have significantly advanced text generation, stories generated by LLMs still diverge from human-authored works regarding complex narrative structure and human-aligned preferences. A key reason is the absence of effective modeling of human story preferences, which are inherently subjective and under-explored. In this work, we systematically evaluate the modeling of human story preferences and introduce StoryRMB, the first benchmark for assessing reward models on story preferences. StoryRMB contains 1,133 high-quality, human-verified instances, each consisting of a prompt, one chosen story, and three rejected stories. We find existing reward models struggle to select human-preferred stories, with the best model achieving only 66.3% accuracy. To address…
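The instance format described in the abstract (a prompt, one chosen story, three rejected stories) suggests a best-of-four accuracy metric. The sketch below assumes an instance counts as correct only when the reward model scores the chosen story strictly above every rejected one; the text here does not state the exact scoring rule. The names `PreferenceInstance`, `best_of_n_accuracy`, and `length_score` are hypothetical and are not taken from the StoryReward repository.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class PreferenceInstance:
    """One benchmark item: a prompt, one chosen story, and rejected stories."""
    prompt: str
    chosen: str
    rejected: Sequence[str]


def best_of_n_accuracy(
    instances: Sequence[PreferenceInstance],
    score: Callable[[str, str], float],
) -> float:
    """Fraction of instances where the chosen story outscores all rejected ones.

    Assumption: ties count as failures (strict inequality).
    """
    if not instances:
        raise ValueError("need at least one instance")
    correct = 0
    for item in instances:
        chosen_score = score(item.prompt, item.chosen)
        if all(chosen_score > score(item.prompt, r) for r in item.rejected):
            correct += 1
    return correct / len(instances)


def length_score(prompt: str, story: str) -> float:
    """Toy stand-in for a reward model (hypothetical): longer stories score higher."""
    return float(len(story.split()))


# Two made-up instances, only to exercise the metric.
demo = [
    PreferenceInstance(
        prompt="A lost dog",
        chosen="The dog wandered home through the rain at dusk",
        rejected=["Dog home", "The dog left", "A dog"],
    ),
    PreferenceInstance(
        prompt="A storm",
        chosen="Rain fell",
        rejected=["The storm tore the roof off the old barn", "Wind", "It rained hard"],
    ),
]

accuracy = best_of_n_accuracy(demo, length_score)
print(accuracy)
```

Under this assumed rule, a scorer that picks uniformly at random among four candidates would land near 25%, which gives some context for the reported 66.3%.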

Code & Models

Repositories

THU-KEG/StoryReward (GitHub)

Videos

StoryAlign: Evaluating and Training Reward Models for Story Generation (SlidesLive)