Document Reconstruction Unlocks Scalable Long-Context RLVR

Yao Xiao; Lei Wang; Yue Deng; Guanzheng Chen; Ziqi Jin; Jung-jae Kim; Xiaoli Li; Roy Ka-wei Lee; Lidong Bing

arXiv:2602.08237·cs.CL·March 3, 2026

Open Access

TL;DR

This paper introduces an unsupervised document reconstruction method that improves the long-context understanding of large language models by training them to reconstruct missing document parts, reducing reliance on costly annotations.

Contribution

It proposes a novel unsupervised reinforcement learning approach for long-context modeling by reconstructing documents, enhancing LLMs' global coherence without heavy supervision.

Findings

01 Significant performance gains on the RULER benchmark.

02 Reasonable improvements on LongBench v2 without manually annotated QA data.

03 Extensive ablation studies validate the method's effectiveness.

Abstract

Reinforcement Learning with Verifiable Rewards (RLVR) has become a prominent paradigm for enhancing the capabilities (e.g., long-context understanding) of Large Language Models (LLMs). However, it often relies on gold-standard answers or explicit evaluation rubrics provided by powerful teacher models or human experts, which are costly and time-consuming to obtain. In this work, we investigate unsupervised approaches to enhance the long-context capabilities of LLMs, eliminating the need for heavy human annotation or supervision from teacher models. Specifically, we first replace a few paragraphs in a long document with special placeholders. LLMs are then trained through reinforcement learning to reconstruct the document by correctly identifying and sequencing the missing paragraphs from a set of candidate options. This training paradigm enables the model to capture global narrative coherence, significantly boosting…
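
The task construction described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names `build_reconstruction_task` and `exact_match_reward`, the `<MASK_i>` placeholder format, the absence of distractor candidates, and the all-or-nothing reward. The paper's actual reward and prompt format may differ.

```python
import random


def build_reconstruction_task(paragraphs, num_masked=2, seed=0):
    """Hypothetical helper: mask paragraphs and build a candidate set.

    Returns (masked_doc, candidates, answer), where answer[slot] is the
    index into `candidates` of the paragraph that belongs in <MASK_slot>.
    """
    rng = random.Random(seed)
    # Choose which paragraphs to hide; keep them in document order.
    masked_idx = sorted(rng.sample(range(len(paragraphs)), num_masked))

    # Replace each hidden paragraph with a placeholder (assumed format).
    masked_doc = list(paragraphs)
    for slot, idx in enumerate(masked_idx):
        masked_doc[idx] = f"<MASK_{slot}>"

    # Shuffle the hidden paragraphs so their order gives nothing away.
    order = list(range(num_masked))
    rng.shuffle(order)
    candidates = [paragraphs[masked_idx[i]] for i in order]

    # For each slot, record where its true paragraph sits in `candidates`.
    answer = [order.index(slot) for slot in range(num_masked)]
    return masked_doc, candidates, answer


def exact_match_reward(predicted, answer):
    """Assumed verifiable reward: 1.0 only if every slot is filled correctly."""
    return 1.0 if list(predicted) == list(answer) else 0.0


if __name__ == "__main__":
    doc = [f"Paragraph {i}." for i in range(6)]
    masked_doc, candidates, answer = build_reconstruction_task(doc, num_masked=3)
    print(masked_doc)
    print(candidates)
    print(exact_match_reward(answer, answer))
```

Because the answer is derived from the document itself, the reward can be checked automatically without human labels or a teacher model, which is the property the abstract relies on.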

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications