Logit-Entropy Adaptive Stopping Heuristic for Efficient Chain-of-Thought Reasoning

Mohammad Atif Quamar; Mohammad Areeb

arXiv:2511.04654·cs.CL·November 7, 2025

TL;DR

LEASH is a training-free, adaptive stopping heuristic for chain-of-thought reasoning in large language models that reduces token usage and latency with minimal accuracy loss.

Contribution

It introduces LEASH, a novel decoding algorithm that adaptively halts rationale generation based on entropy and logit signals, without additional training.
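The two signals can be written down as follows. This is a sketch under assumptions, not the paper's notation: the summary above does not give formulas, so the symbols $z_t$, $p_t$, $H_t$, and $m_t$ are introduced here for illustration, and the margin is assumed to be the gap between the two largest logits.

```latex
% Assumed notation: z_t is the logit vector at decoding step t over vocabulary V.
\begin{align*}
  p_t(v) &= \frac{\exp z_t(v)}{\sum_{u \in V} \exp z_t(u)}
    && \text{(next-token distribution)} \\
  H_t &= -\sum_{v \in V} p_t(v) \log p_t(v)
    && \text{(token-level entropy)} \\
  m_t &= z_t^{(1)} - z_t^{(2)}
    && \text{(top-logit margin: largest minus second-largest logit)}
\end{align*}
```

Under this reading, a plateau means that $H_t$ has stopped changing from step to step and $m_t$ has stopped improving.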

Findings

01 Reduces average token generation by 30-35%

02 Lowers latency by 27%

03 Incurs a 10 p.p. accuracy drop relative to CoT

Abstract

Chain-of-Thought (CoT) prompting is a key technique for enabling complex reasoning in large language models. However, generating full, fixed-length rationales is computationally wasteful, inflating both token usage and latency. We introduce LEASH: Logit-Entropy Adaptive Stopping Heuristic, a training-free decoding algorithm that adaptively halts rationale generation. LEASH monitors two intrinsic signals: the slope of token-level entropy and the improvement in the top-logit margin. It terminates the generation once both signals plateau, indicating the model has reached a stable reasoning state. Across four instruction-tuned models on the GSM8K and AQuA-RAT benchmarks, LEASH reduces average token generation by 30-35% and latency by 27%, while incurring a 10 p.p. accuracy drop relative to CoT. LEASH is model-agnostic and requires no additional training or supervision, offering a simple…
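The stopping rule described in the abstract might look roughly like the following minimal sketch. It is not the authors' implementation: the function names, the window size, both tolerance values, and the use of a least-squares fit for the entropy slope are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def token_entropy(logits):
    """Shannon entropy (in nats) of softmax(logits) for one decoding step."""
    z = logits - np.max(logits)                 # shift for numerical stability
    log_probs = z - np.log(np.sum(np.exp(z)))   # log-softmax
    return float(-np.sum(np.exp(log_probs) * log_probs))


def top_logit_margin(logits):
    """Gap between the largest and second-largest logit (assumed definition)."""
    second, first = np.partition(logits, -2)[-2:]
    return float(first - second)


def should_stop(entropies, margins, window=4, slope_tol=0.05, margin_tol=0.05):
    """Return True once both signals look flat over the last `window` steps.

    window, slope_tol, and margin_tol are hypothetical hyperparameters.
    """
    if len(entropies) < window:
        return False
    recent = np.asarray(entropies[-window:], dtype=float)
    # Entropy slope: least-squares line through the recent entropy values.
    slope = np.polyfit(np.arange(window), recent, 1)[0]
    # Margin improvement: change in margin across the same window.
    margin_gain = margins[-1] - margins[-window]
    return bool(abs(slope) < slope_tol and margin_gain < margin_tol)


def first_stop_step(logit_steps, **kwargs):
    """Scan per-step logit vectors; return the first step index where the
    rule fires, or None if it never does."""
    entropies, margins = [], []
    for t, logits in enumerate(logit_steps):
        entropies.append(token_entropy(logits))
        margins.append(top_logit_margin(logits))
        if should_stop(entropies, margins, **kwargs):
            return t
    return None
```

In a real decoding loop, `logit_steps` would be the per-token logits produced while the rationale is being generated, and generation of the rationale would be cut off at the returned step before the final answer is produced. Requiring both conditions at once mirrors the abstract's statement that generation stops only when both signals plateau.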

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Constraint Satisfaction and Optimization