Through the Stealth Lens: Rethinking Attacks and Defenses in RAG
Sarthak Choudhary, Nils Palumbo, Ashish Hooda, Krishnamurthy Dj Dvijotham, Somesh Jha

TL;DR
This paper investigates the stealthiness of poisoning attacks on retrieval-augmented generation systems, formalizes detection challenges, and proposes an attention-based filtering method that improves defense effectiveness against such attacks.
Contribution
The paper introduces a formal framework for stealth in RAG attacks and develops an attention-based filtering technique to detect and mitigate poisoned passages.
Findings
Attention-based filtering improves defense accuracy by up to 20%.
Stealthier adaptive attacks can achieve up to 35% success rate.
Stealth is formalized using a distinguishability-based security game.
Abstract
Retrieval-augmented generation (RAG) systems are vulnerable to attacks that inject poisoned passages into the retrieved set, even at low corruption rates. We show that existing attacks are not designed to be stealthy, allowing reliable detection and mitigation. We formalize stealth using a distinguishability-based security game. If a few poisoned passages are designed to control the response, they must differentiate themselves from benign ones, inherently compromising stealth. This motivates the need for attackers to rigorously analyze intermediate signals involved in generation, such as attention patterns or next-token probability distributions, to avoid easily detectable traces of manipulation. Leveraging attention patterns, we propose a passage-level score, the Normalized Passage Attention Score, used by our…
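The abstract is cut off before it defines the Normalized Passage Attention Score, so the following is only an illustrative sketch of what a passage-level attention score could look like, not the paper's definition. It assumes that per-token attention weights over the retrieved context are already available, that each passage occupies a known token span, and that a passage is suspicious when its share of total attention exceeds a fixed threshold. The function names `passage_attention_shares` and `filter_passages`, the threshold value, and the toy numbers are all hypothetical.

```python
from typing import List, Sequence, Tuple


def passage_attention_shares(
    token_attention: Sequence[float],
    spans: Sequence[Tuple[int, int]],
) -> List[float]:
    """Sum attention over each passage span [start, end) and normalize
    so that the shares across passages add up to 1.

    Assumption: this normalization is a stand-in, not the paper's formula.
    """
    totals = [sum(token_attention[start:end]) for start, end in spans]
    grand_total = sum(totals)
    if grand_total == 0:
        # No attention mass on any passage: return all-zero shares.
        return [0.0] * len(spans)
    return [t / grand_total for t in totals]


def filter_passages(shares: Sequence[float], threshold: float) -> List[int]:
    """Return indices of passages whose attention share does not exceed
    the (hypothetical) threshold; the rest are treated as suspicious."""
    return [i for i, share in enumerate(shares) if share <= threshold]


# Toy example: three passages of two tokens each; the third passage
# attracts most of the attention.
attention = [0.05, 0.05, 0.1, 0.1, 0.6, 0.1]
spans = [(0, 2), (2, 4), (4, 6)]
shares = passage_attention_shares(attention, spans)
kept = filter_passages(shares, threshold=0.5)
print(shares, kept)
```

In this toy setup the third passage would be dropped because it dominates the attention mass. A real filter would need to choose which layers, heads, and generated tokens the attention is taken from, details the truncated abstract does not give.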
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Security and Verification in Computing · Advanced Malware Detection Techniques
Methods: Softmax · Attention Is All You Need
