CODE: A Contradiction-Based Deliberation Extension Framework for Overthinking Attacks on Retrieval-Augmented Generation
Xiaolei Zhang, Xiaojun Jia, Liquan Chen, Songze Li

TL;DR
This paper introduces CODE, an attack framework exploiting overthinking in reasoning-enhanced RAG systems by injecting contradictory poisoning samples, significantly increasing reasoning tokens without affecting task accuracy.
Contribution
We propose an end-to-end contradiction-based attack framework, CODE, that induces overthinking in RAG systems with reasoning models by injecting cleverly crafted poisoning samples.
Findings
The attack causes a 5.32x-24.72x increase in reasoning tokens.
The overthinking attack does not degrade task performance.
The framework is effective across multiple datasets and models.
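A multiplier such as "5.32x" can be read as a ratio of reasoning-token counts with and without the attack. The sketch below shows one plausible way to compute such a ratio. It assumes the factor is the mean attacked count divided by the mean baseline count, which this summary does not confirm. The function name `token_amplification` and all token counts are hypothetical and are not taken from the paper.

```python
from statistics import mean


def token_amplification(baseline_tokens, attacked_tokens):
    """Return mean attacked reasoning tokens divided by mean baseline reasoning tokens.

    Assumption: the amplification factor is a ratio of means. The paper may
    define it differently (for example, per query or per dataset).
    """
    if not baseline_tokens or not attacked_tokens:
        raise ValueError("both token lists must be non-empty")
    base = mean(baseline_tokens)
    if base <= 0:
        raise ValueError("baseline mean must be positive")
    return mean(attacked_tokens) / base


# Hypothetical per-query reasoning-token counts, for illustration only.
baseline = [400, 500, 600]
attacked = [2500, 2700, 2900]

ratio = token_amplification(baseline, attacked)
print(f"{ratio:.2f}x")  # prints "5.40x"
```

Because the findings also state that task performance is not degraded, a fuller evaluation would report accuracy alongside this ratio for the same set of queries.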
Abstract
Introducing reasoning models into Retrieval-Augmented Generation (RAG) systems enhances task performance through step-by-step reasoning, logical consistency, and multi-step self-verification. However, recent studies have shown that reasoning models are vulnerable to overthinking attacks, in which models are tricked into generating an unnecessarily high number of reasoning tokens. In this paper, we reveal that this overthinking risk can be inherited by RAG systems equipped with reasoning models, by proposing an end-to-end attack framework named Contradiction-Based Deliberation Extension (CODE). Specifically, CODE develops a multi-agent architecture to construct poisoning samples that are injected into the knowledge base. These samples 1) are highly correlated with the user query, so that they can be retrieved as inputs to the reasoning model; and 2) contain contradiction between the logical and evidence…
Taxonomy
Topics: Topic Modeling · Advanced Graph Neural Networks · Information Retrieval and Search Behavior
