Secure Retrieval-Augmented Generation against Poisoning Attacks
Zirui Cheng, Jikai Sun, Anjun Gao, Yueyang Quan, Zhuqing Liu, Xiaohua Hu, Minghong Fang

TL;DR
This paper presents RAGuard, a detection framework that secures Retrieval-Augmented Generation (RAG) systems against poisoning attacks by filtering poisoned texts out of the retrieved results with several complementary detection strategies.
Contribution
The paper introduces RAGuard, a novel non-parametric detection framework that improves retrieval security against poisoning attacks in RAG systems.
Findings
RAGuard effectively detects poisoning attacks in large-scale datasets.
Expanding the retrieval scope raises the proportion of clean texts among the retrieved candidates, which reduces the likelihood of retrieving poisoned texts.
RAGuard mitigates the impact of adaptive poisoning attacks.
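A rough way to see why a wider retrieval scope helps, under assumptions that are not stated in this summary: suppose the attacker injects at most m poisoned texts for a target query, the original retriever returns the top k texts, and the expanded retriever returns the top N > k texts. The symbols m, k, and N are illustrative only.

```latex
\[
\underbrace{\frac{\min(m, k)}{k}}_{\text{worst-case poisoned fraction, top-}k}
\quad\text{versus}\quad
\underbrace{\frac{\min(m, N)}{N} \le \frac{m}{N}}_{\text{worst-case poisoned fraction, top-}N}
\]
```

When m is at least k, the top-k set can consist entirely of poisoned texts, whereas the top-N bound m/N shrinks as N grows. For example, with m = 5, k = 5, and N = 25, the worst case drops from 100% to 20% poisoned.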
Abstract
Large language models (LLMs) have transformed natural language processing (NLP), enabling applications from content generation to decision support. Retrieval-Augmented Generation (RAG) improves LLMs by incorporating external knowledge but also introduces security risks, particularly from data poisoning, where the attacker injects poisoned texts into the knowledge database to manipulate system outputs. While various defenses have been proposed, they often struggle against advanced attacks. To address this, we introduce RAGuard, a detection framework designed to identify poisoned texts. RAGuard first expands the retrieval scope to increase the proportion of clean texts, reducing the likelihood of retrieving poisoned content. It then applies chunk-wise perplexity filtering to detect abnormal variations and text similarity filtering to flag highly similar texts. This non-parametric approach…
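The three stages named in the abstract (scope expansion, chunk-wise perplexity filtering, text similarity filtering) can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, thresholds, and chunk size are hypothetical, an add-one-smoothed unigram model stands in for an LLM-based perplexity scorer, token-set Jaccard similarity stands in for whatever similarity measure the paper uses, and "abnormal variation" is approximated as the gap between the highest and lowest chunk perplexity.

```python
import math
from collections import Counter


def tokens(text):
    return text.lower().split()


def jaccard(a, b):
    """Token-set Jaccard similarity (assumed stand-in for the paper's similarity measure)."""
    sa, sb = set(tokens(a)), set(tokens(b))
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def fit_unigram(reference_corpus):
    """Add-one-smoothed unigram model (toy stand-in for an LLM perplexity scorer)."""
    counts = Counter(t for text in reference_corpus for t in tokens(text))
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot for unseen tokens
    return lambda tok: (counts[tok] + 1) / (total + vocab)


def perplexity_spread(text, prob, chunk_size=3):
    """Gap between the highest and lowest perplexity over fixed-size token chunks."""
    toks = tokens(text)
    ppls = []
    for i in range(0, len(toks), chunk_size):
        chunk = toks[i:i + chunk_size]
        log_p = sum(math.log(prob(t)) for t in chunk)
        ppls.append(math.exp(-log_p / len(chunk)))
    return max(ppls) - min(ppls) if ppls else 0.0


def drop_similar(texts, sim_threshold):
    """Drop every text that is highly similar to at least one other candidate."""
    return [
        t for i, t in enumerate(texts)
        if all(jaccard(t, u) <= sim_threshold
               for j, u in enumerate(texts) if j != i)
    ]


def guarded_retrieve(query, database, prob, k, expand_factor=5,
                     spread_threshold=5.0, sim_threshold=0.5):
    # Step 1: expand the retrieval scope from k to k * expand_factor candidates.
    # The toy retriever ranks by Jaccard overlap with the query.
    ranked = sorted(database, key=lambda t: jaccard(query, t), reverse=True)
    candidates = ranked[:k * expand_factor]
    # Step 2: chunk-wise perplexity filtering (drop texts with a large spread).
    candidates = [t for t in candidates
                  if perplexity_spread(t, prob) <= spread_threshold]
    # Step 3: text similarity filtering (drop clusters of near-duplicates).
    candidates = drop_similar(candidates, sim_threshold)
    return candidates[:k]


# Toy data: two clean texts, one text with a garbled suffix, two near-duplicates.
reference = ["the cat sat on the mat", "the dog sat on the rug"]
database = [
    "the cat sat on the mat",
    "the cat sat zzz yyy xxx",
    "cat food alpha beta",
    "cat food alpha gamma",
    "the dog sat on the rug",
]
prob = fit_unigram(reference)
result = guarded_retrieve("cat", database, prob, k=1)
```

On this toy data, plain top-1 retrieval for the query "cat" would return one of the near-duplicate texts, while the guarded pipeline returns "the cat sat on the mat". Dropping every member of a highly similar pair is a deliberate simplification here; it would also discard legitimately overlapping clean texts, so a real system would need a more careful rule and tuned thresholds.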
