AALC: Large Language Model Efficient Reasoning via Adaptive Accuracy-Length Control

Ruosen Li; Ziming Luo; Quan Zhang; Ruochen Li; Ben Zhou; Ali Payani; Xinya Du

arXiv:2506.20160·cs.CL·August 11, 2025

TL;DR

AALC introduces an adaptive reward mechanism for large reasoning models that reduces reasoning length by over 50% without sacrificing accuracy, leading to more efficient and structurally refined outputs.

Contribution

This work presents a novel reinforcement learning approach with an adaptive accuracy-length reward to improve reasoning efficiency in large models.

Findings

01. Reduces response length by over 50% while maintaining accuracy.

02. Curbs redundant reasoning patterns like excessive subgoal setting.

03. Efficiency gains lead to reduced interpretability and less explanatory context.

Abstract

Large reasoning models (LRMs) achieve impressive reasoning capabilities by generating lengthy chains of thought, but this "overthinking" incurs high latency and cost without commensurate accuracy gains. In this work, we introduce AALC, a lightweight, accuracy-aware length reward integrated into reinforcement learning that dynamically balances correctness and brevity during training. By incorporating validation accuracy into the reward and employing a smooth, dynamically scheduled length penalty, AALC delays the length penalty until target performance is met. Through extensive experiments across standard and out-of-distribution math benchmarks, we show that our approach reduces response length by over 50% while maintaining or even improving the original accuracy. Furthermore, qualitative analysis reveals that our method curbs redundant reasoning patterns such as excessive subgoal setting and…
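The abstract describes the reward only in words, so the following is a minimal illustrative sketch of the general idea, not the paper's actual formula. It assumes a sigmoid gate on the gap between validation accuracy and a target accuracy, and a penalty linear in relative length; the names `length_gate`, `shaped_reward`, `sharpness`, and `alpha` are hypothetical and do not come from the source.

```python
import math


def length_gate(val_acc: float, target_acc: float, sharpness: float = 20.0) -> float:
    """Smooth weight in (0, 1) for the length penalty (assumed sigmoid form).

    Close to 0 while validation accuracy is below the target, so the
    penalty is effectively delayed; close to 1 once the target is exceeded.
    """
    return 1.0 / (1.0 + math.exp(-sharpness * (val_acc - target_acc)))


def shaped_reward(
    is_correct: bool,
    length: int,
    max_length: int,
    val_acc: float,
    target_acc: float,
    alpha: float = 0.5,
) -> float:
    """Correctness reward minus a gated, length-proportional penalty.

    This is an assumed form for illustration only: the penalty grows with
    the response length relative to max_length and is scaled by the gate.
    """
    correctness = 1.0 if is_correct else 0.0
    relative_length = min(length / max_length, 1.0)
    gate = length_gate(val_acc, target_acc)
    return correctness - alpha * gate * relative_length


if __name__ == "__main__":
    # Early in training (validation accuracy far below target): length barely matters.
    print(shaped_reward(True, 1000, 2000, val_acc=0.3, target_acc=0.8))
    # After the target is met: a shorter correct answer earns more than a longer one.
    print(shaped_reward(True, 500, 2000, val_acc=0.9, target_acc=0.8))
    print(shaped_reward(True, 2000, 2000, val_acc=0.9, target_acc=0.8))
```

Under these assumptions, the gate keeps the reward close to pure correctness until the model reaches the target accuracy, after which brevity is increasingly rewarded. The schedule actually used by AALC may differ.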

Code & Models

Repositories

du-nlp-lab/lengthreward
PyTorch · Official

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Data Quality and Management