Incorporating Self-Rewriting into Large Language Model Reasoning Reinforcement

Jiashu Yao; Heyan Huang; Shuang Zeng; Chuwei Luo; WangJie You; Jie Tang; Qingsong Liu; Yuhang Guo; Yangyang Kang

arXiv:2511.16331·cs.CL·November 21, 2025

Open Access

TL;DR

This paper introduces a self-rewriting framework for large reasoning models: the model rewrites its own reasoning texts and then learns from the rewritten versions, which improves both internal reasoning quality and accuracy while making reasoning shorter.

Contribution

The paper proposes a self-rewriting approach for large reasoning models that improves internal reasoning quality and accuracy with only about 10% computational overhead.

Findings

01 Improved accuracy (+0.6) and shorter reasoning (-46%)

02 Higher internal reasoning scores (+7.2)

03 Effective scalability with ~10% overhead

Abstract

Through reinforcement learning (RL) with outcome correctness rewards, large reasoning models (LRMs) with scaled inference computation have demonstrated substantial success on complex reasoning tasks. However, a one-sided reward focused solely on final correctness cannot provide detailed supervision over the internal reasoning process. This deficiency leads to suboptimal internal reasoning quality, manifesting as issues such as over-thinking, under-thinking, redundant thinking, and disordered thinking. Inspired by recent progress in LRM self-rewarding, we introduce a self-rewriting framework, in which a model rewrites its own reasoning texts and subsequently learns from the rewritten reasoning to improve the quality of its internal thought process. For algorithm design, we propose a selective rewriting approach wherein only "simple" samples, defined by the model's consistent…
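
The selective rewriting idea described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `Sample`, `selective_rewrite`, and both callbacks are hypothetical names. Because the abstract's definition of a "simple" sample is cut off, the selection criterion is left as a caller-supplied predicate rather than guessed.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Sample:
    """One sampled response: a question plus the model's reasoning text."""
    question: str
    reasoning: str
    rewritten: bool = False


def selective_rewrite(
    samples: List[Sample],
    is_simple: Callable[[Sample], bool],  # hypothetical: criterion not given in the excerpt
    rewrite: Callable[[str], str],        # hypothetical: stands in for the model rewriting itself
) -> List[Sample]:
    """Rewrite only the samples flagged as simple; pass the rest through unchanged."""
    result = []
    for sample in samples:
        if is_simple(sample):
            result.append(
                Sample(sample.question, rewrite(sample.reasoning), rewritten=True)
            )
        else:
            result.append(sample)
    return result


def toy_rewrite(text: str) -> str:
    """Toy stand-in for a rewriter: drop repeated sentences, keep first occurrences."""
    kept: List[str] = []
    for sentence in text.split(". "):
        if sentence not in kept:
            kept.append(sentence)
    return ". ".join(kept)


if __name__ == "__main__":
    batch = [
        Sample("2+2?", "Add 2 and 2. Add 2 and 2. Get 4"),
        Sample("hard problem", "A long chain of reasoning"),
    ]
    # The predicate below is a placeholder for illustration only.
    for s in selective_rewrite(batch, lambda s: s.question == "2+2?", toy_rewrite):
        print(s.rewritten, "|", s.reasoning)
```

In an actual training loop the rewritten reasoning would then serve as a learning target. How that signal is combined with the outcome correctness reward is not specified in this excerpt, so it is left out of the sketch.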

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Natural Language Processing Techniques