POISONCRAFT: Practical Poisoning of Retrieval-Augmented Generation for Large Language Models

Yangguang Shao; Xinjie Lin; Haozheng Luo; Chengshang Hou; Gang Xiong; Jiahao Yu; Junzheng Shi

arXiv:2505.06579 · cs.CR · May 13, 2025



TL;DR

This paper introduces POISONCRAFT, a practical poisoning attack on retrieval-augmented generation systems that can mislead large language models by injecting fraudulent information, exposing security vulnerabilities in real-world applications.

Contribution

We propose POISONCRAFT, a novel poisoning attack on RAG systems that does not require user query access and remains effective across different models and defenses.

Findings

01 POISONCRAFT effectively misleads RAG models across datasets and models.

02 The attack transfers successfully to black-box systems.

03 It influences retrieval behavior and reasoning steps in LLMs.

Abstract

Large language models (LLMs) have achieved remarkable success in various domains, primarily due to their strong capabilities in reasoning and generating human-like text. Despite their impressive performance, LLMs are susceptible to hallucinations, which can lead to incorrect or misleading outputs. These hallucinations stem primarily from a lack of up-to-date knowledge or domain-specific information. Retrieval-augmented generation (RAG) is a promising approach to mitigating hallucinations by leveraging external knowledge sources. However, the security of RAG systems has not been thoroughly studied. In this paper, we study a poisoning attack on RAG systems named POISONCRAFT, which can mislead the model into referring to fraudulent websites. Compared to existing poisoning attacks on RAG systems, our attack is more practical because it requires neither access to the target user's query nor the ability to edit that query. It…
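To make the setting in the abstract concrete, the following is a minimal sketch of the retrieval step in a RAG pipeline and of why the integrity of the knowledge base matters. It does not reproduce POISONCRAFT; the bag-of-words retriever, the tiny corpus, the injected passage, and the function names (`tokenize`, `cosine`, `retrieve`, `build_prompt`) are all assumptions made for illustration. In particular, the injected passage here is hand-written for a known topic, whereas the paper states that its attack needs no access to the user's query.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and split on non-alphanumeric characters.
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    # Cosine similarity between two bag-of-words Counters.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, corpus, k=2):
    # Return the indices of the k passages most similar to the query.
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(doc))), i) for i, doc in enumerate(corpus)]
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [i for _, i in scored[:k]]


def build_prompt(query, corpus, k=2):
    # Whatever is retrieved is placed verbatim in the model's context.
    context = "\n".join(corpus[i] for i in retrieve(query, corpus, k))
    return f"Context:\n{context}\n\nQuestion: {query}"


# Hypothetical knowledge base (not from the paper).
clean_corpus = [
    "The Eiffel Tower is located in Paris and was completed in 1889.",
    "Python is a programming language created by Guido van Rossum.",
    "The Great Wall of China is visible across northern China.",
]
# Hypothetical injected passage pointing at a placeholder domain.
injected = (
    "Eiffel Tower visitor information: where the tower is located "
    "and opening hours, see example.invalid."
)
query = "Where is the Eiffel Tower located?"

clean_ids = retrieve(query, clean_corpus)
poisoned_ids = retrieve(query, clean_corpus + [injected])
print(clean_ids, poisoned_ids)
```

In this toy setup the injected passage outranks the genuine one because it shares more terms with the query, so its text (including the placeholder link) ends up in the prompt built by `build_prompt`. Real RAG systems typically use dense embedding retrievers rather than word counts, so the sketch only illustrates the general exposure, not how the paper crafts its passages.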


Code & Models

Repositories

andyshaw01/poisoncraft
PyTorch · Official


Taxonomy

Topics: Topic Modeling · Adversarial Robustness in Machine Learning · Multimodal Machine Learning Applications