ConciseRL: Conciseness-Guided Reinforcement Learning for Efficient Reasoning Models

Razvan-Gabriel Dumitru; Darius Peteleaza; Vikas Yadav; Liangming Pan

arXiv:2505.17250·cs.CL·May 26, 2025

TL;DR

ConciseRL uses a hyperparameter-free conciseness score, assigned by an LLM judge, as the reward signal in reinforcement learning, steering reasoning models toward traces that are both correct and concise. On the MATH dataset it reduces token usage by up to 31x on simple problems while improving accuracy by 7%.

Contribution

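The paper presents a novel, hyperparameter-free conciseness score used as a reward signal in reinforcement learning. Because the score is evaluated by a large language model acting as a judge, the feedback is dynamic and context-aware rather than a simple function of token length, and it guides reasoning models toward concise, accurate outputs.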
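To make the idea of a judge-scored reward concrete, here is a minimal sketch, not the authors' implementation. The 1-10 score scale, the zero reward for incorrect traces, the function names, and the line-counting `toy_judge` (a stand-in for the LLM judge) are all assumptions introduced for illustration.

```python
from typing import Callable

# Hypothetical judge signature: takes a problem and a reasoning trace and
# returns an integer conciseness score. In the paper the judge is an LLM;
# the 1-10 scale used here is an assumption for illustration.
Judge = Callable[[str, str], int]

MAX_SCORE = 10  # assumed upper end of the judge's scale


def conciseness_reward(problem: str, trace: str, is_correct: bool, judge: Judge) -> float:
    """Return a reward in [0, 1] for one sampled reasoning trace.

    Assumption: incorrect traces get zero reward, so the policy is never
    paid for being short but wrong.
    """
    if not is_correct:
        return 0.0
    score = judge(problem, trace)
    # Clamp in case the judge returns something off-scale.
    score = max(1, min(MAX_SCORE, score))
    return score / MAX_SCORE


def toy_judge(problem: str, trace: str) -> int:
    """Stand-in for the LLM judge: fewer non-empty lines gives a higher score.

    A real judge would read the trace in context rather than count lines.
    """
    steps = [line for line in trace.splitlines() if line.strip()]
    return max(1, MAX_SCORE - (len(steps) - 1))


problem = "What is 2 + 2?"
short_trace = "2 + 2 = 4"
long_trace = "\n".join([
    "Let me think.",
    "2 + 2",
    "Wait, check again.",
    "2 + 2 = 4",
    "So the answer is 4.",
])

r_short = conciseness_reward(problem, short_trace, True, toy_judge)
r_long = conciseness_reward(problem, long_trace, True, toy_judge)
r_wrong = conciseness_reward(problem, "5", False, toy_judge)
print(r_short, r_long, r_wrong)
```

In this sketch the short correct trace earns the highest reward, the padded correct trace earns less, and the incorrect trace earns nothing. In a training loop, a reward like this would be passed to whichever policy-gradient method is in use; swapping `toy_judge` for a call to an actual judge model is the part that makes the signal context-aware.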

Findings

01 On MATH, reduces token usage by up to 31x on simple problems.

02 Improves accuracy by 7% on the MATH dataset.

03 On the hardest problems, outperforms full reasoning by 7.5% accuracy with up to 3.6x fewer tokens.

Abstract

Large language models excel at complex tasks by breaking down problems into structured reasoning steps. However, reasoning traces often extend beyond reaching a correct answer, causing wasted computation, reduced readability, and hallucinations. To address this, we introduce a novel hyperparameter-free conciseness score used as a reward signal within a reinforcement learning framework to guide models toward generating correct and concise reasoning traces. This score is evaluated by a large language model acting as a judge, enabling dynamic, context-aware feedback beyond simple token length. Our method achieves state-of-the-art efficiency-accuracy trade-offs on the MATH dataset, reducing token usage by up to 31x on simple problems while improving accuracy by 7%, and on the hardest problems, it outperforms full reasoning by +7.5% accuracy with up to 3.6x fewer tokens. On TheoremQA, our…

Code & Models

Repositories

razvandu/conciserl · PyTorch · Official