When More Thinking Hurts: Overthinking in LLM Test-Time Compute Scaling

Shu Zhou; Rui Ling; Junan Chen; Xin Wang; Tao Fan; Hao Wang

arXiv:2604.10739 · cs.AI · April 14, 2026

TL;DR

This paper examines the diminishing returns of extended reasoning in large language models. It documents overthinking, shows that the optimal thinking length depends on problem difficulty, and finds that stopping at moderate reasoning budgets cuts computation while keeping accuracy comparable.

Contribution

The paper systematically analyzes the marginal utility of additional reasoning tokens, documents the overthinking phenomenon, and argues for allocating compute adaptively rather than uniformly.

Findings

01

Marginal utility of reasoning diminishes at higher compute budgets.

02

Extended reasoning is associated with models abandoning previously correct answers (overthinking).

03

Moderate reasoning budgets can achieve similar accuracy with less computation.
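
The second finding can be made concrete with a small sketch. This summary does not give the paper's exact metric, so the definition below is an assumption: a problem counts as overthought if the model is correct at some budget and incorrect at a larger one. The function names and the sample trajectories are hypothetical and are not the paper's data.

```python
from typing import List


def abandoned_correct(trajectory: List[bool]) -> bool:
    """Return True if a correct answer is followed by an incorrect one.

    `trajectory` holds per-budget correctness for one problem,
    ordered from the smallest reasoning budget to the largest.
    """
    seen_correct = False
    for correct in trajectory:
        if seen_correct and not correct:
            return True
        seen_correct = seen_correct or correct
    return False


def overthinking_rate(trajectories: List[List[bool]]) -> float:
    """Fraction of problems on which a correct answer was later abandoned."""
    flagged = sum(abandoned_correct(t) for t in trajectories)
    return flagged / len(trajectories)


# Hypothetical correctness records at four increasing budgets.
example = [
    [False, True, True, True],     # improves and stays correct
    [True, True, False, False],    # abandons a correct answer
    [False, False, False, True],   # needs the largest budget
    [False, True, False, True],    # flips away, then recovers
]
rate = overthinking_rate(example)
```

Under this assumed definition, a problem that flips away from a correct answer and later recovers still counts, since the intermediate budget would have returned a wrong answer.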

Abstract

Scaling test-time compute through extended chains of thought has become a dominant paradigm for improving large language model reasoning. However, existing research implicitly assumes that longer thinking always yields better results. This assumption remains largely unexamined. We systematically investigate how the marginal utility of additional reasoning tokens changes as compute budgets increase. We find that marginal returns diminish substantially at higher budgets and that models exhibit "overthinking", where extended reasoning is associated with abandoning previously correct answers. Furthermore, we show that optimal thinking length varies across problem difficulty, suggesting that uniform compute allocation is suboptimal. Our cost-aware evaluation framework reveals that stopping at moderate budgets can reduce computation significantly while maintaining comparable accuracy.
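
To illustrate what marginal utility and cost-aware stopping could look like in practice, here is a minimal sketch. It is not the paper's evaluation framework, whose details are not given in this summary. It assumes accuracy has been measured at a few fixed token budgets, computes the accuracy gained per additional 1,000 reasoning tokens, and picks the smallest budget whose accuracy is within a chosen tolerance of the best. All names and numbers are hypothetical.

```python
from typing import List


def marginal_gains(budgets: List[int], accuracies: List[float]) -> List[float]:
    """Accuracy points gained per extra 1,000 tokens between consecutive budgets."""
    gains = []
    for i in range(1, len(budgets)):
        extra_tokens = budgets[i] - budgets[i - 1]
        gains.append((accuracies[i] - accuracies[i - 1]) / extra_tokens * 1000)
    return gains


def smallest_budget_within(
    budgets: List[int], accuracies: List[float], tolerance: float
) -> int:
    """Smallest budget whose accuracy is within `tolerance` points of the best."""
    best = max(accuracies)
    for budget, accuracy in zip(budgets, accuracies):
        if accuracy >= best - tolerance:
            return budget
    raise ValueError("budgets and accuracies must be non-empty")


# Hypothetical measurements: token budgets and accuracy in percent.
budgets = [512, 1024, 2048, 4096, 8192]
accuracies = [50.0, 62.0, 68.0, 70.0, 69.0]

gains = marginal_gains(budgets, accuracies)
chosen = smallest_budget_within(budgets, accuracies, tolerance=2.0)
```

With these made-up numbers the per-token gain shrinks at every step and turns negative at the largest budget, and a 2-point tolerance selects the 2,048-token budget. A per-difficulty variant would run the same selection separately on each difficulty bucket.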
