DPWriter: Reinforcement Learning with Diverse Planning Branching for Creative Writing

Qian Cao; Yahui Liu; Wei Bi; Yi Zhao; Ruihua Song; Xiting Wang; Ruiming Tang; Guorui Zhou; Han Li

arXiv:2601.09609·cs.CL·January 15, 2026

Open Access

TL;DR

This paper introduces DPWriter, an RL framework for creative writing that enhances output diversity by explicitly planning intermediate steps and encouraging diverse trajectories, outperforming existing methods.

Contribution

The paper proposes an RL approach that combines Diverse Planning Branching with a group-aware diversity reward to improve output diversity in creative writing tasks.

Findings

01. Significantly improves output diversity in creative writing benchmarks.

02. Maintains high quality of generated content.

03. Outperforms existing baseline methods.

Abstract

Reinforcement learning (RL)-based enhancement of large language models (LLMs) often leads to reduced output diversity, undermining their utility in open-ended tasks like creative writing. Current methods lack explicit mechanisms for guiding diverse exploration and instead prioritize optimization efficiency and performance over diversity. This paper proposes an RL framework structured around a semi-structured long Chain-of-Thought (CoT), in which the generation process is decomposed into explicitly planned intermediate steps. We introduce a Diverse Planning Branching method that strategically introduces divergence at the planning phase based on diversity variation, alongside a group-aware diversity reward to encourage distinct trajectories. Experimental results on creative writing benchmarks demonstrate that our approach significantly improves output diversity without compromising…
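
The abstract names a group-aware diversity reward but does not define it here. As a rough illustration of the general idea only, the sketch below scores each sample in a group of generations for the same prompt by how little it overlaps with the other samples. The function names, the use of word bigrams, and the Jaccard similarity are all assumptions made for this example and are not taken from the paper.

```python
from typing import List, Set, Tuple


def ngrams(text: str, n: int = 2) -> Set[Tuple[str, ...]]:
    """Return the set of lowercase word n-grams in `text` (hypothetical helper)."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def jaccard(a: Set[Tuple[str, ...]], b: Set[Tuple[str, ...]]) -> float:
    """Jaccard similarity of two n-gram sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def group_diversity_rewards(samples: List[str], n: int = 2) -> List[float]:
    """Assumed reward: 1 minus the mean similarity to the other group members.

    A sample that repeats its siblings gets a low reward; a sample that
    shares nothing with them gets 1.0. A group of one sample gets 1.0.
    """
    grams = [ngrams(s, n) for s in samples]
    rewards = []
    for i, g in enumerate(grams):
        sims = [jaccard(g, h) for j, h in enumerate(grams) if j != i]
        mean_sim = sum(sims) / len(sims) if sims else 0.0
        rewards.append(1.0 - mean_sim)
    return rewards


if __name__ == "__main__":
    group = [
        "the cat sat on the mat",
        "the cat sat on the mat",
        "a dragon guards the silent library",
    ]
    print(group_diversity_rewards(group))
```

In an RL setup of the kind the abstract describes, a term like this would presumably be combined with a quality reward so that distinct trajectories are encouraged without rewarding incoherent text; how the paper actually balances the two is not stated in the text above.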

Taxonomy

Topics: Artificial Intelligence in Games · Mobile Crowdsensing and Crowdsourcing · Topic Modeling