When Thinking Fails: The Pitfalls of Reasoning for Instruction-Following in LLMs
Xiaomin Li, Zhou Yu, Zhiwei Zhang, Xupeng Chen, Ziji Zhang, Yingying Zhuang, Narayanan Sadagopan, Anurag Beniwal

TL;DR
This paper reveals that chain-of-thought prompting can impair instruction-following accuracy in large language models, and proposes strategies to mitigate these negative effects, highlighting a previously overlooked pitfall in reasoning-enhanced LLMs.
Contribution
The paper systematically documents the negative impact of explicit reasoning on instruction-following and introduces practical mitigation techniques, along with a novel attention-based metric.
Findings
CoT prompting often reduces instruction-following accuracy.
Selective reasoning strategies can recover performance.
Attention diversion explains reasoning-induced failures.
Abstract
Reasoning-enhanced large language models (RLLMs), whether explicitly trained for reasoning or prompted via chain-of-thought (CoT), have achieved state-of-the-art performance on many complex reasoning tasks. However, we uncover a surprising and previously overlooked phenomenon: explicit CoT reasoning can significantly degrade instruction-following accuracy. Evaluating 15 models on two benchmarks, IFEval (with simple, rule-verifiable constraints) and ComplexBench (with complex, compositional constraints), we consistently observe performance drops when CoT prompting is applied. Through large-scale case studies and an attention-based analysis, we identify common patterns where reasoning either helps (e.g., with formatting or lexical precision) or hurts (e.g., by neglecting simple constraints or introducing unnecessary content). We propose a metric, constraint attention, to quantify model…
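To make "rule-verifiable constraints" concrete, here is a minimal Python sketch of how instruction-following accuracy might be scored with and without CoT. Everything in it is an assumption for illustration: the checker names (`max_words`, `all_lowercase`, `contains_keyword`), the `instruction_accuracy` helper, and the toy response strings are hypothetical and are not taken from IFEval, ComplexBench, or the paper's code. The toy CoT-style response fails because it adds unnecessary content, one of the failure patterns the abstract mentions.

```python
from typing import Callable, List

Check = Callable[[str], bool]

# Hypothetical rule-based checkers (illustrative names, not IFEval's API).
def max_words(limit: int) -> Check:
    """Constraint: the response has at most `limit` whitespace-separated words."""
    return lambda text: len(text.split()) <= limit

def all_lowercase(text: str) -> bool:
    """Constraint: the response contains no uppercase letters."""
    return text == text.lower()

def contains_keyword(keyword: str) -> Check:
    """Constraint: the response mentions `keyword` (case-insensitive)."""
    return lambda text: keyword.lower() in text.lower()

def follows_all(response: str, checks: List[Check]) -> bool:
    """A response counts as correct only if every constraint is satisfied."""
    return all(check(response) for check in checks)

def instruction_accuracy(responses: List[str], checks_per_prompt: List[List[Check]]) -> float:
    """Fraction of responses that satisfy all constraints of their prompt."""
    passed = sum(
        follows_all(response, checks)
        for response, checks in zip(responses, checks_per_prompt)
    )
    return passed / len(responses)

# Two toy prompts, each with its own constraint set.
checks_per_prompt = [
    [max_words(5), all_lowercase],
    [contains_keyword("tea")],
]

# Hand-written toy strings, NOT real model outputs.
direct = ["green tea is nice", "I like tea."]
with_cot = [
    "Sure! Here is a short lowercase sentence: green tea is nice",
    "I like tea.",
]

acc_direct = instruction_accuracy(direct, checks_per_prompt)
acc_cot = instruction_accuracy(with_cot, checks_per_prompt)
print(f"direct: {acc_direct:.2f}, with CoT: {acc_cot:.2f}")
```

In this toy setup the first CoT-style response breaks both the word limit and the lowercase rule, so accuracy falls from 1.0 to 0.5. The numbers only illustrate the scoring mechanics and say nothing about the paper's measured results.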
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Text Readability and Simplification
Methods: Softmax · Attention Is All You Need · Chain-of-thought prompting · Focus
