Walk Before You Run! Concise LLM Reasoning via Reinforcement Learning

Mingyang Song; Mao Zheng

arXiv:2505.21178·cs.CL·May 28, 2025

TL;DR

This paper introduces ConciseR, a two-stage reinforcement learning framework that enhances the conciseness and reasoning efficiency of large language models' responses, outperforming existing models on multiple reasoning benchmarks.

Contribution

The paper proposes a novel two-stage RL approach, ConciseR, which enforces response conciseness in LLMs while maintaining reasoning quality, using a walk-before-you-run strategy.

Findings

01. ConciseR generates more concise reasoning responses.

02. ConciseR outperforms recent state-of-the-art models on multiple reasoning benchmarks.

03. ConciseR is effective at reducing overthinking and redundancy in LLM reasoning.

Abstract

As test-time scaling becomes a pivotal research frontier in Large Language Models (LLMs) development, contemporary and advanced post-training methodologies increasingly focus on extending the generation length of long Chain-of-Thought (CoT) responses to enhance reasoning capabilities toward DeepSeek R1-like performance. However, recent studies reveal a persistent overthinking phenomenon in state-of-the-art reasoning models, manifesting as excessive redundancy or repetitive thinking patterns in long CoT responses. To address this issue, in this paper, we propose a simple yet effective two-stage reinforcement learning framework for achieving concise reasoning in LLMs, named ConciseR. Specifically, the first stage, using more training steps, aims to incentivize the model's reasoning capabilities via Group Relative Policy Optimization with clip-higher and dynamic sampling components…
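The first-stage ingredients named in the abstract (group-relative advantages, clip-higher, and dynamic sampling) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the per-sample (rather than per-token) formulation, and the clipping values `eps_low=0.2` and `eps_high=0.28` are all assumptions chosen for illustration, and the abstract as shown here does not specify them.

```python
import math
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Normalize each reward against its own sampling group (GRPO-style).

    `rewards` holds the scalar rewards of all responses sampled for one prompt.
    """
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = math.sqrt(var)
    # eps avoids division by zero when every reward in the group is equal.
    return [(r - mean) / (std + eps) for r in rewards]


def keep_group(rewards: List[float]) -> bool:
    """Dynamic-sampling filter (assumed form): drop groups whose rewards are
    all identical, since every advantage would be zero and give no gradient."""
    return max(rewards) != min(rewards)


def clipped_objective(ratio: float, advantage: float,
                      eps_low: float = 0.2, eps_high: float = 0.28) -> float:
    """PPO-style clipped surrogate with an asymmetric range ("clip-higher").

    `ratio` is new_policy_prob / old_policy_prob for a sample. The upper bound
    (1 + eps_high) is looser than the lower bound (1 - eps_low).
    Both epsilon values are illustrative assumptions.
    """
    clipped_ratio = min(max(ratio, 1.0 - eps_low), 1.0 + eps_high)
    return min(ratio * advantage, clipped_ratio * advantage)
```

Under these assumed values, a probability ratio of 1.25 with a positive advantage passes through unclipped, whereas a symmetric range of 0.2 would cap it at 1.2. That extra headroom for raising the probability of rewarded responses is the point of the asymmetric upper bound.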

Taxonomy

Topics: Multi-Agent Systems and Negotiation

Methods: Focus