CleanBase: Detecting Malicious Documents in RAG Knowledge Databases

Weifei Jin; Xilong Wang; Wei Zou; Jinyuan Jia; Neil Gong

arXiv:2605.00460 · cs.CR · May 4, 2026

TL;DR

CleanBase detects malicious documents in RAG knowledge databases by identifying clusters of highly similar documents, with the aim of defending against prompt injection attacks.

Contribution

The paper introduces a similarity-graph approach that detects malicious documents by their high semantic similarity to one another and by the cliques they form in the graph, supported by theoretical and empirical validation.

Findings

01 CleanBase accurately detects malicious documents across multiple datasets.

02 The method effectively prevents prompt injection attacks in RAG systems.

03 Theoretical bounds on false positive and false negative rates are established.

Abstract

Retrieval-augmented generation (RAG) is vulnerable to prompt injection attacks, in which an adversary inserts malicious documents containing carefully crafted injected prompts into the knowledge database. When a user issues a question targeted by the attack, the RAG system may retrieve these malicious documents, whose injected prompts mislead it into generating attacker-specified answers, thereby compromising the integrity of the RAG system. In this work, we propose CleanBase, a method to detect malicious documents within a knowledge database. Our key insight is that malicious documents crafted for the same attack-targeted questions often exhibit high semantic similarity, as attackers deliberately make them consistent to improve attack success rates. Accordingly, CleanBase constructs a similarity graph over the knowledge database, where each node represents a document and an edge…
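
The similarity-graph idea described above can be illustrated with a small sketch. The abstract is cut off before it defines when two documents share an edge, so everything specific here is an assumption made for illustration: edges are drawn when cosine similarity between document embeddings meets a threshold, documents are flagged when they belong to a clique of at least `min_clique_size` nodes, and the function names, the threshold value, and the toy embeddings are all hypothetical. This is not the authors' implementation.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def build_similarity_graph(embeddings, threshold):
    """One node per document; an edge joins two documents whose cosine
    similarity is at least `threshold` (assumed edge rule)."""
    adj = {i: set() for i in range(len(embeddings))}
    for i, j in combinations(range(len(embeddings)), 2):
        if cosine(embeddings[i], embeddings[j]) >= threshold:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(current)
            return
        for v in list(candidates):
            expand(current | {v}, candidates & adj[v], excluded & adj[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(adj), set())
    return cliques


def flag_suspicious(embeddings, threshold=0.95, min_clique_size=3):
    """Return sorted indices of documents that sit in a large enough clique.
    Both default values are illustrative, not taken from the paper."""
    adj = build_similarity_graph(embeddings, threshold)
    flagged = set()
    for clique in maximal_cliques(adj):
        if len(clique) >= min_clique_size:
            flagged |= clique
    return sorted(flagged)


# Toy embeddings (hypothetical): documents 0-2 are unrelated to each other,
# documents 3-5 are near-duplicates, as injected documents written for the
# same targeted question might be.
toy_embeddings = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.70, 0.70, 0.10],
    [0.71, 0.69, 0.10],
    [0.69, 0.71, 0.12],
]

flagged_docs = flag_suspicious(toy_embeddings)
print(flagged_docs)
```

On the toy data, only the three near-duplicate documents form a clique of the required size. Enumerating maximal cliques is exponential in the worst case, so a knowledge database of realistic size would likely need a different search strategy than this sketch.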

Code & Models

Repositories

WeifeiJin/CleanBase (GitHub)
