OverThink: Slowdown Attacks on Reasoning LLMs

Abhinav Kumar; Jaechul Roh; Ali Naseh; Marzena Karpinska; Mohit Iyyer; Amir Houmansadr; Eugene Bagdasarian

arXiv:2502.02542·cs.LG·February 5, 2026

TL;DR

OverThink is an attack on reasoning language models: it injects benign decoy problems into the external context a model consumes, substantially increasing reasoning token usage, inference latency, and cost while the model still produces a contextually correct answer.

Contribution

The paper introduces OverThink, a slowdown attack on reasoning LLMs that uses decoy reasoning problems to increase inference overhead. Because the decoys are benign, they evade safety filters.

Findings

01 OverThink significantly increases reasoning token usage and inference latency.

02 The attack transfers across different models and modalities.

03 Defenses can mitigate but not fully prevent the slowdown effects.

Abstract

Most flagship language models generate explicit reasoning chains, enabling inference-time scaling. However, producing these reasoning chains increases token usage (i.e., reasoning tokens), which in turn increases latency and costs. Our OverThink attack increases overhead for applications that rely on reasoning language models (RLMs) and external context by forcing them to spend substantially more reasoning tokens while still producing contextually correct answers. An adversary mounts an attack by injecting decoy reasoning problems into public content that is consumed by an RLM at inference time. Because our decoys (e.g., Markov decision processes, Sudokus, etc.) are benign, they evade safety filters. We evaluate OverThink on both closed-source and open-source reasoning models across the FreshQA, SQuAD, and MuSR datasets. We also explore the attack in multi-modal settings by creating images…
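
The overhead the abstract describes can be made concrete with simple arithmetic. The following is a minimal sketch, not the authors' evaluation code: the `Usage` record, the flat per-token price, and every number in it are hypothetical and chosen only to show how a reasoning-token slowdown factor and the resulting extra cost per query might be computed.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    """Token counts for one model response (hypothetical record type)."""
    reasoning_tokens: int
    output_tokens: int


def slowdown_factor(baseline: Usage, attacked: Usage) -> float:
    """Reasoning tokens with the decoy present divided by reasoning tokens without it."""
    if baseline.reasoning_tokens <= 0:
        raise ValueError("baseline must contain at least one reasoning token")
    return attacked.reasoning_tokens / baseline.reasoning_tokens


def cost_usd(usage: Usage, price_per_million: float) -> float:
    """Cost assuming reasoning and output tokens share one flat rate (an assumption)."""
    total = usage.reasoning_tokens + usage.output_tokens
    return total * price_per_million / 1_000_000


# Illustrative numbers only; they are not results from the paper.
baseline = Usage(reasoning_tokens=500, output_tokens=100)
attacked = Usage(reasoning_tokens=9_000, output_tokens=100)

factor = slowdown_factor(baseline, attacked)
extra = cost_usd(attacked, 10.0) - cost_usd(baseline, 10.0)
print(f"reasoning-token slowdown: {factor:.1f}x, extra cost per query: ${extra:.4f}")
```

In this sketch the visible answer length is unchanged between the two runs, which mirrors the abstract's point that the answer stays contextually correct while the hidden reasoning grows.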

Code & Models

Repositories

akumar2709/overthink_public
none · Official

Taxonomy

Topics: Security and Verification in Computing

Methods: Attention Is All You Need · Linear Warmup With Linear Decay · Weight Decay · WordPiece · Attention Dropout · Layer Normalization · Linear Layer · Byte Pair Encoding · Dense Connections