Not All Thoughts are Generated Equal: Efficient LLM Reasoning via Multi-Turn Reinforcement Learning

Yansong Ning; Wei Li; Jun Fang; Naiqiang Tan; Hao Liu

arXiv:2505.11827·cs.CL·May 27, 2025

TL;DR

This paper introduces Long×Short, a collaborative reasoning framework in which two LLMs split the work: a long-thought LLM generates the important thoughts and a short-thought LLM generates the remaining ones. The framework significantly reduces token length while maintaining reasoning performance.

Contribution

The paper proposes a multi-turn reinforcement learning approach for collaboration between two LLMs, together with a metric that jointly measures the effectiveness and efficiency of individual thoughts in a long chain of thought.

Findings

01. Achieves over 80% token reduction across multiple benchmarks.
02. Maintains comparable reasoning performance to larger models.
03. Demonstrates effective collaboration between long-thought and short-thought LLMs.

Abstract

Compressing long chain-of-thought (CoT) from large language models (LLMs) is an emerging strategy to improve the reasoning efficiency of LLMs. Despite its promising benefits, existing studies equally compress all thoughts within a long CoT, hindering more concise and effective reasoning. To this end, we first investigate the importance of different thoughts by examining their effectiveness and efficiency in contributing to reasoning through automatic long CoT chunking and Monte Carlo rollouts. Building upon the insights, we propose a theoretically bounded metric to jointly measure the effectiveness and efficiency of different thoughts. We then propose Long $\otimes$ Short, an efficient reasoning framework that enables two LLMs to collaboratively solve the problem: a long-thought LLM for more effectively generating important thoughts, while a short-thought LLM for efficiently generating…
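The rollout-based analysis mentioned in the abstract can be pictured with a minimal sketch. This is not the paper's metric, which the truncated abstract does not define. It assumes a hypothetical `rollout` callback that samples one completion from a prefix of thought chunks and reports whether the final answer is correct. It scores each chunk by its marginal accuracy gain (effectiveness) and by that gain divided by a whitespace token count (a crude stand-in for efficiency). The names `rollout_accuracy`, `chunk_contributions`, and `toy_rollout` are illustrative only.

```python
import random
from typing import Callable, Dict, List, Sequence

# Hypothetical callback: given a prefix of thought chunks and an RNG,
# sample one completion and return True if the final answer is correct.
Rollout = Callable[[Sequence[str], random.Random], bool]


def rollout_accuracy(prefix: Sequence[str], rollout: Rollout,
                     n_rollouts: int, rng: random.Random) -> float:
    """Monte Carlo estimate of the success rate when continuing from `prefix`."""
    hits = sum(1 for _ in range(n_rollouts) if rollout(prefix, rng))
    return hits / n_rollouts


def chunk_contributions(chunks: List[str], rollout: Rollout,
                        n_rollouts: int = 64, seed: int = 0) -> List[Dict]:
    """Score each chunk by the accuracy gained when it is appended to the prefix.

    `gain` is an assumed proxy for effectiveness; `gain_per_token` is an
    assumed proxy for efficiency (whitespace tokens, not a real tokenizer).
    """
    rng = random.Random(seed)
    results = []
    prev_acc = rollout_accuracy([], rollout, n_rollouts, rng)
    for i, chunk in enumerate(chunks):
        acc = rollout_accuracy(chunks[: i + 1], rollout, n_rollouts, rng)
        gain = acc - prev_acc
        n_tokens = max(len(chunk.split()), 1)
        results.append({"chunk": chunk, "gain": gain,
                        "gain_per_token": gain / n_tokens})
        prev_acc = acc
    return results


def toy_rollout(prefix: Sequence[str], rng: random.Random) -> bool:
    """Deterministic stand-in for an LLM: succeeds only once the key chunk is present."""
    return "key step" in prefix


if __name__ == "__main__":
    demo = ["restate the problem", "key step", "double check"]
    for row in chunk_contributions(demo, toy_rollout, n_rollouts=8):
        print(row)
```

In practice the callback would wrap a sampled LLM completion plus an answer checker, and the estimates would be noisy, so `n_rollouts` trades cost against variance.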

Code & Models

Datasets

yasNing/OpenMath-ThoughtChunk1.8K (dataset · 12 downloads)

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Advanced Graph Neural Networks