Batch Prompting Suppresses Overthinking Reasoning Under Constraint: How Batch Prompting Suppresses Overthinking in Reasoning Models

Saurabh Srivastava; Janit Bidhan; Hao Yan; Abhishek Dey; Tanu Kansal; Paras Kath; Sina Mansouri; Mohit Marvania; Vamsi Shankar Simhadri; Gaurav Singh

arXiv:2511.04108·cs.CL·February 23, 2026

Open Access

TL;DR

Batch prompting significantly reduces overthinking in reasoning models, cutting reasoning tokens by 76% while maintaining or improving accuracy, by inducing beneficial behavioral effects during inference.

Contribution

This paper demonstrates that batch prompting suppresses overthinking in reasoning models at inference time, improving efficiency and reliability without modifying the models.

Findings

01 Reduces reasoning tokens by 76% on average

02 Enables pattern induction from multiple queries

03 Suppresses metacognitive hedging behaviors

Abstract

Large Reasoning Models (LRMs) achieve strong performance through explicit chain-of-thought reasoning but suffer from overthinking: generating excessive reasoning tokens even for trivial queries. Beyond inflating cost, overthinking can be self-defeating: models enter recursive self-doubt loops that exhaust token budgets without producing an answer, causing API timeouts that directly hurt accuracy. We present an empirical study showing that batch prompting, originally introduced for throughput optimization, effectively suppresses overthinking at inference time. Across 13 diverse benchmarks with DeepSeek-R1 and OpenAI-o1, batch prompting reduces reasoning tokens by 76% on average (2,950 → 710) while preserving or improving accuracy. Through behavioral analysis, we find that batching induces three beneficial effects: (1) it reduces per-query reasoning effort…
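To make the idea concrete, here is a minimal sketch of what batching several queries into one prompt might look like, together with a check of the 76% figure from the token counts quoted in the abstract. The function names, the `Q<i>:`/`A<i>:` numbering scheme, the instruction wording, and the sample questions are all assumptions for illustration; the abstract does not specify the prompt template the authors used.

```python
def build_batch_prompt(questions):
    """Pack several questions into one numbered prompt.

    The instruction text and Q/A numbering are hypothetical, not the
    paper's actual template.
    """
    lines = [
        "Answer each question below. "
        "Reply with one line per question as 'A<i>: <answer>'."
    ]
    for i, question in enumerate(questions, start=1):
        lines.append(f"Q{i}: {question}")
    return "\n".join(lines)


def percent_reduction(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before


# Illustrative questions only; not drawn from the paper's benchmarks.
questions = [
    "What is 17 + 25?",
    "Is 91 a prime number?",
    "What is the capital of France?",
]
prompt = build_batch_prompt(questions)
print(prompt)

# Average reasoning tokens reported in the abstract: 2,950 -> 710.
print(round(percent_reduction(2950, 710)))  # 76
```

The single prompt would then be sent to the model in one call instead of three, and the answers split back out by their `A<i>:` labels.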

Taxonomy

Topics: Constraint Satisfaction and Optimization · Explainable Artificial Intelligence (XAI) · Bayesian Modeling and Causal Inference