Chain-of-Thought Matters: Improving Long-Context Language Models with Reasoning Path Supervision

Dawei Zhu; Xiyu Wei; Guangxiang Zhao; Wenhao Wu; Haosheng Zou; Junfeng Ran; Xun Wang; Lin Sun; Xiangzheng Zhang; Sujian Li

arXiv:2502.20790·cs.CL·March 3, 2025

TL;DR

This paper investigates the effectiveness of Chain-of-Thought prompting for long-context reasoning in large language models and introduces LongRePS, a framework that enhances reasoning path quality to improve long-context task performance.

Contribution

The paper demonstrates that Chain-of-Thought benefits extend to long-context tasks and introduces LongRePS, a process-supervised method for generating high-quality reasoning paths in such scenarios.

Findings

01. CoT benefits generalize to long-context tasks and increase with context length.

02. LongRePS significantly improves performance on long-context benchmarks.

03. The approach achieves notable gains over outcome supervision baselines.

Abstract

Recent advances in Large Language Models (LLMs) have highlighted the challenge of handling long-context tasks, where models need to reason over extensive input contexts to aggregate target information. While Chain-of-Thought (CoT) prompting has shown promise for multi-step reasoning, its effectiveness for long-context scenarios remains underexplored. Through systematic investigation across diverse tasks, we demonstrate that CoT's benefits generalize across most long-context scenarios and amplify with increasing context length. Motivated by this critical observation, we propose LongRePS, a process-supervised framework that teaches models to generate high-quality reasoning paths for enhanced long-context performance. Our framework incorporates a self-sampling mechanism to bootstrap reasoning paths and a novel quality assessment protocol specifically designed for long-context scenarios.…
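The abstract names two components, self-sampling of reasoning paths and a quality assessment step, without giving their details here. The following is only a minimal sketch of how such a sample-then-filter loop could look, not the authors' implementation. Every name in it (`Candidate`, `select_reasoning_paths`, `quoted_span_support`, `sample_fn`, `quality_fn`) is hypothetical, and both filters (answer matching and quoted-span checking) are illustrative assumptions rather than the paper's protocol.

```python
import re
from typing import Callable, List, NamedTuple


class Candidate(NamedTuple):
    """One sampled reasoning path and the final answer it arrives at."""
    reasoning: str
    answer: str


def quoted_span_support(context: str, cand: Candidate) -> float:
    """Toy quality score (assumption): the fraction of double-quoted spans
    in the reasoning that appear verbatim in the long context."""
    spans = re.findall(r'"([^"]+)"', cand.reasoning)
    if not spans:
        return 0.0
    return sum(span in context for span in spans) / len(spans)


def select_reasoning_paths(
    context: str,
    question: str,
    gold_answer: str,
    sample_fn: Callable[[str, str], Candidate],
    quality_fn: Callable[[str, Candidate], float] = quoted_span_support,
    num_samples: int = 4,
    min_quality: float = 0.5,
) -> List[Candidate]:
    """Draw candidates from the model itself (via sample_fn) and keep the
    ones that pass an outcome check and a process-level quality check."""
    kept = []
    for _ in range(num_samples):
        cand = sample_fn(context, question)
        # Outcome check (assumption): final answer must match the reference.
        if cand.answer.strip().lower() != gold_answer.strip().lower():
            continue
        # Process check (assumption): the path itself must score well enough.
        if quality_fn(context, cand) >= min_quality:
            kept.append(cand)
    return kept
```

In this sketch, `sample_fn` stands in for a call to the model being trained, and the surviving candidates would serve as supervised fine-tuning targets. Swapping in a different `quality_fn` changes the process-level filter without touching the sampling loop.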

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Natural Language Processing Techniques