DiffAdapt: Difficulty-Adaptive Reasoning for Token-Efficient LLM Inference

Xiang Liu; Xuming Hu; Xiaowen Chu; Eunsol Choi

arXiv:2510.19669·cs.CL·May 11, 2026

TL;DR

DiffAdapt is a difficulty-aware inference framework that selects a reasoning strategy for each question based on problem difficulty and entropy, reducing token usage by up to 22.4% while maintaining accuracy.

Contribution

The paper proposes a lightweight method that uses entropy to classify reasoning difficulty, improving token efficiency in LLM inference without fine-tuning.

Findings

01. Achieves up to 22.4% token reduction while maintaining accuracy.

02. Identifies a U-shaped entropy pattern across problem difficulties.

03. Demonstrates effectiveness across five models and eight benchmarks.

Abstract

Recent reasoning Large Language Models (LLMs) demonstrate remarkable problem-solving abilities but often generate long thinking traces whose utility is unclear. Our work aims to improve their efficiency, enabling them to reach high performance without overthinking. First, we analyze the entropy of token probabilities in reasoning traces. Across three models, we observe a consistent U-shaped entropy pattern: high entropy on easy problems despite high accuracy, low entropy on problems with medium difficulty, and high entropy on hard problems reflecting uncertainty. Specifically, we notice 22-25% entropy reduction from easy to medium difficulty regions, suggesting an overthinking phenomenon on easy instances. Building on these insights, we introduce DiffAdapt, a lightweight framework that selects Easy/Normal/Hard inference strategies per question based on their difficulty and…
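To make the entropy analysis in the abstract concrete, here is a minimal sketch of how per-token entropy over a reasoning trace might be measured and how a relative reduction between two difficulty regions would be computed. This is an illustration, not the authors' implementation: the abstract does not say how entropy is aggregated, so averaging Shannon entropy over decoding steps is an assumption, and the function names `token_entropy`, `mean_trace_entropy`, and `relative_reduction` are hypothetical.

```python
import math


def token_entropy(probs):
    """Shannon entropy (in nats) of a single next-token distribution."""
    # Zero-probability tokens contribute nothing, so they are skipped.
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def mean_trace_entropy(step_distributions):
    """Average per-token entropy over one reasoning trace.

    Assumption: the trace is given as a list of next-token probability
    distributions, one per decoding step.
    """
    if not step_distributions:
        raise ValueError("trace must contain at least one decoding step")
    total = sum(token_entropy(p) for p in step_distributions)
    return total / len(step_distributions)


def relative_reduction(baseline, value):
    """Fractional drop from `baseline` to `value` (0.25 means a 25% drop)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - value) / baseline
```

Under these assumptions, a reduction like the one reported in the abstract would correspond to `relative_reduction(easy_entropy, medium_entropy)` falling roughly between 0.22 and 0.25, where the two arguments are mean entropies aggregated over the easy and medium difficulty regions.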
