The Overthinker's DIET: Cutting Token Calories with DIfficulty-AwarE Training

Weize Chen; Jiarui Yuan; Tailin Jin; Ning Ding; Huimin Chen; Zhiyuan Liu; Maosong Sun

arXiv:2505.19217 · cs.CL · May 27, 2025

TL;DR

DIET introduces a difficulty-aware training framework that reduces token usage in large language models while improving reasoning performance, enabling better scaling and more efficient inference.

Contribution

The paper proposes DIET, a novel method integrating task difficulty into RL training to optimize token compression and model efficiency.

Findings

1. Significantly reduces token counts without sacrificing reasoning quality.

2. Enhances inference scaling by maintaining high per-sample quality with fewer tokens.

3. Preserves the positive correlation between response length and difficulty, improving verbosity management.

Abstract

Recent large language models (LLMs) exhibit impressive reasoning but often overthink, generating excessively long responses that hinder efficiency. We introduce DIET (DIfficulty-AwarE Training), a framework that systematically cuts these "token calories" by integrating on-the-fly problem difficulty into the reinforcement learning (RL) process. DIET dynamically adapts token compression strategies by modulating token penalty strength and conditioning target lengths on estimated task difficulty, in order to optimize the performance-efficiency trade-off. We also theoretically analyze the pitfalls of naive reward weighting in group-normalized RL algorithms like GRPO, and propose the Advantage Weighting technique, which enables stable and effective implementation of these difficulty-aware objectives. Experimental results demonstrate that DIET significantly reduces token counts while simultaneously…
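The abstract does not give the formulas, so the following is only a minimal sketch of how the two ideas it names might fit together. It assumes that difficulty is estimated as one minus the group's accuracy, that the penalty weight shrinks linearly with difficulty, and that Advantage Weighting means normalizing each reward component separately before combining them. All function names and the linear schedule are hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev

def normalize(xs, eps=1e-6):
    """Group normalization as used in GRPO-style methods: (x - mean) / std."""
    m, s = mean(xs), pstdev(xs)
    return [(x - m) / (s + eps) for x in xs]

def estimate_difficulty(correct):
    """Assumed on-the-fly difficulty: 1 - fraction of correct samples in the group."""
    return 1.0 - mean(correct)

def difficulty_aware_alpha(base_alpha, difficulty):
    """Assumed linear schedule: harder problems get a weaker token penalty."""
    return base_alpha * (1.0 - difficulty)

def naive_reward_weighting(correct, lengths, alpha):
    """Combine the rewards first, then normalize the sum as a single signal."""
    rewards = [c - alpha * n for c, n in zip(correct, lengths)]
    return normalize(rewards)

def advantage_weighting(correct, lengths, alpha):
    """Normalize each component on its own, then apply the weight."""
    a_correct = normalize(correct)
    a_length = normalize([-n for n in lengths])
    return [ac + alpha * al for ac, al in zip(a_correct, a_length)]

# One sampled group for an easy problem: every response is correct.
correct = [1, 1, 1, 1]
lengths = [100, 200, 300, 400]

alpha = difficulty_aware_alpha(1.0, estimate_difficulty(correct))
naive_small = naive_reward_weighting(correct, lengths, 0.001)
naive_large = naive_reward_weighting(correct, lengths, 0.01)
weighted_half = advantage_weighting(correct, lengths, 0.5 * alpha)
weighted_full = advantage_weighting(correct, lengths, alpha)
```

In this toy group the naive variant yields essentially the same advantages for both penalty weights, because dividing by the group standard deviation cancels the scale of the weight whenever correctness is constant. The separately normalized variant keeps the advantage proportional to the weight. This is one plausible reading of the pitfall the abstract mentions, not a reproduction of the paper's analysis. The abstract also mentions difficulty-conditioned target lengths, which this sketch does not cover.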

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Explainable Artificial Intelligence (XAI)