Rewarding Creativity: A Human-Aligned Generative Reward Model for Reinforcement Learning in Storytelling

Zhaoyan Li; Hang Lei; Yujia Wang; Lanbo Liu; Hao Liu; Liang Yu

arXiv:2601.07149 · cs.AI · January 13, 2026

Open Access

TL;DR

This paper presents RLCS, a framework combining a novel Generative Reward Model and entropy-based reward shaping to improve reinforcement learning for creative storytelling, achieving higher alignment with human judgments and better story quality.

Contribution

It introduces a systematic approach to reward modeling and training stability in RL for storytelling, including a new reward model and dynamic reward shaping strategies.

Findings

01 GenRM achieves 68% alignment with human creativity judgments.

02 RLCS outperforms strong baselines like Gemini-2.5-Pro.

03 The proposed methods improve story quality and training stability.

Abstract

While Large Language Models (LLMs) can generate fluent text, producing high-quality creative stories remains challenging. Reinforcement Learning (RL) offers a promising solution but faces two critical obstacles: designing reliable reward signals for subjective storytelling quality and mitigating training instability. This paper introduces the Reinforcement Learning for Creative Storytelling (RLCS) framework to systematically address both challenges. First, we develop a Generative Reward Model (GenRM) that provides multi-dimensional analysis and explicit reasoning about story preferences, trained through supervised fine-tuning on demonstrations with reasoning chains distilled from strong teacher models, followed by GRPO-based refinement on expanded preference data. Second, we introduce an entropy-based reward shaping strategy that dynamically prioritizes learning on confident errors and…
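
The abstract mentions GRPO-based refinement without giving details. As a rough illustration only, the group-relative advantage commonly associated with GRPO can be sketched as below: rewards for several responses sampled from the same prompt are normalized by the group's mean and standard deviation. The function name `group_relative_advantages`, the `eps` parameter, and the choice of population standard deviation are assumptions made for this sketch, not details taken from the paper, and the sketch does not cover the paper's entropy-based reward shaping.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of responses to the same prompt.

    Hypothetical helper for illustration: each advantage is the reward's
    offset from the group mean, scaled by the group standard deviation.
    `eps` (an assumed stabilizer) avoids division by zero when every
    reward in the group is identical.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Example: four sampled stories scored by a reward model (made-up scores).
advantages = group_relative_advantages([1.0, 0.0, 1.0, 0.0])
```

Responses scored above the group mean receive positive advantages and those below receive negative ones, so no separate value network is needed to provide a baseline.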

Taxonomy

Topics: Artificial Intelligence in Games · Topic Modeling · Creativity in Education and Neuroscience