Graph-Based Chain-of-Thought Pruning for Reducing Redundant Reflections in Reasoning LLMs
Hongyuan Yuan, Xinran He, Run Shao, Bolei He, Xianwei Xue, Mengke Chen, Qiutong Pan, Haiwei Wang, and Haifeng Li

TL;DR
This paper introduces a graph-based framework to optimize chain-of-thought reasoning in large language models, significantly reducing redundant reflections and improving efficiency without sacrificing accuracy.
Contribution
The paper converts linear reasoning traces into directed acyclic graphs (DAGs) and applies a dual pruning strategy, combined with a three-stage training pipeline, to make reasoning more efficient.
Findings
Reduces reasoning tokens by 42% on average.
Maintains or improves reasoning accuracy.
Effectively eliminates redundant reflection patterns.
Abstract
Extending chain-of-thought (CoT) reasoning through reinforcement learning (RL) has been widely used to enhance the reasoning capabilities of LLMs. However, due to the sparsity of reward signals, it can also induce undesirable thinking patterns such as overthinking, i.e., generating redundant intermediate reasoning content. In this work, we argue that a major source of such redundancy is inefficient reflection, which often manifests in two problematic patterns: Indiscriminate Reflection, where the model performs broad, low-impact checks throughout reasoning, and Repetitive Reflection, where it repeatedly re-verifies an already established conclusion. To address this, we introduce a graph-based CoT optimization framework. Specifically, we convert each linear CoT into a directed acyclic graph (DAG) with explicit dependency edges, and design a dual pruning strategy: branch-level pruning removes weakly contributing reflection branches, while…
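To make the DAG idea concrete, here is a minimal sketch of branch-level pruning over a reasoning trace. It is not the authors' implementation: the `Step` schema, the `is_reflection` flag, and the pruning criterion (a reflection branch is dropped when the final answer does not transitively depend on it) are all assumptions chosen for illustration. The abstract is truncated before it describes the second pruning strategy, so only the branch-level case is sketched.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """One reasoning step in a CoT trace (hypothetical schema)."""
    sid: int
    text: str
    is_reflection: bool = False
    deps: list[int] = field(default_factory=list)  # ids of earlier steps relied on


def ancestors(by_id: dict[int, Step], target: int) -> set[int]:
    """Return `target` plus every step it transitively depends on."""
    seen: set[int] = set()
    stack = [target]
    while stack:
        sid = stack.pop()
        if sid in seen:
            continue
        seen.add(sid)
        stack.extend(by_id[sid].deps)
    return seen


def prune_reflection_branches(steps: list[Step], answer_id: int) -> list[Step]:
    """Drop reflection branches the answer does not depend on.

    Assumed criterion (not taken from the paper): a branch is "weakly
    contributing" if none of its steps is an ancestor of the answer.
    `steps` must be in trace order, so dependency edges point backwards.
    """
    by_id = {s.sid: s for s in steps}
    needed = ancestors(by_id, answer_id)
    pruned: set[int] = set()
    for s in steps:
        if s.sid in needed:
            continue
        # Prune an unneeded reflection and anything that builds on it.
        if s.is_reflection or any(d in pruned for d in s.deps):
            pruned.add(s.sid)
    return [s for s in steps if s.sid not in pruned]


# Toy trace: steps 2-3 re-verify an established result; step 4 is a
# reflection that actually changes the path to the answer.
steps = [
    Step(0, "Read the problem: combine 3 and 4."),
    Step(1, "Compute 3 * 4 = 12.", deps=[0]),
    Step(2, "Wait, let me double-check 3 * 4.", is_reflection=True, deps=[1]),
    Step(3, "Yes, 3 * 4 is 12.", deps=[2]),
    Step(4, "Actually, the problem asks for the sum.", is_reflection=True, deps=[0]),
    Step(5, "Compute 3 + 4 = 7.", deps=[4]),
    Step(6, "Answer: 7.", deps=[5]),
]
kept = prune_reflection_branches(steps, answer_id=6)
print([s.sid for s in kept])
```

In this toy trace the repetitive check (steps 2 and 3) is removed, while the reflection at step 4 survives because the answer depends on it. A real system would need a learned or heuristic way to extract dependency edges and score contribution; the ancestor test here is only a stand-in.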
