Confundo: Learning to Generate Robust Poison for Practical RAG Systems

Haoyang Hu; Zhejun Jiang; Yueming Lyu; Junyuan Zhang; Yi Liu; Ka-Ho Chow

arXiv:2602.06616 · cs.CR · February 9, 2026

TL;DR

The paper introduces Confundo, a learning-based poisoning framework that improves attack effectiveness and stealthiness in practical RAG systems. It reveals significant security vulnerabilities and proposes defenses against content poisoning.

Contribution

The paper presents Confundo, a novel learning-to-poison framework that outperforms existing attacks by accounting for real-world content processing and query variability in RAG systems.

Findings

01 Confundo achieves higher attack success rates than existing attacks across datasets.

02 It remains effective even when existing defenses are in place.

03 The framework can manipulate factuality, bias, and hallucinations in RAG responses.

Abstract

Retrieval-augmented generation (RAG) is increasingly deployed in real-world applications, where its reference-grounded design makes outputs appear trustworthy. This trust has spurred research on poisoning attacks that craft malicious content, inject it into knowledge sources, and manipulate RAG responses. However, when evaluated in practical RAG systems, existing attacks suffer from severely degraded effectiveness. This gap stems from two overlooked realities: (i) content is often processed before use, which can fragment the poison and weaken its effect, and (ii) users often do not issue the exact queries anticipated during attack design. These factors can lead practitioners to underestimate risks and develop a false sense of security. To better characterize the threat to practical systems, we present Confundo, a learning-to-poison framework that fine-tunes a large language model as a…
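
Point (i) of the abstract, that preprocessing can fragment injected content, can be illustrated with a toy example. The sketch below is not the paper's method or pipeline; it assumes a simple fixed-size word chunker (the hypothetical helpers `chunk_words` and `chunks_containing`) and a harmless placeholder sentence standing in for injected text. It only shows that whether a passage survives intact depends on where the chunk boundaries fall.

```python
def chunk_words(text: str, chunk_size: int, overlap: int = 0) -> list[str]:
    """Split text into chunks of at most `chunk_size` words.

    Hypothetical helper for illustration; real RAG pipelines use a
    variety of splitters (by tokens, sentences, or document structure).
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def chunks_containing(chunks: list[str], phrase: str) -> list[int]:
    """Return the indices of chunks that contain `phrase` in full."""
    return [i for i, chunk in enumerate(chunks) if phrase in chunk]


# Placeholder text: an "injected" sentence surrounded by filler words.
injected = "the capital of Freedonia is Sylvania"
document = f"alpha beta gamma delta epsilon {injected} zeta eta theta iota kappa"

small = chunk_words(document, chunk_size=8)   # boundary falls inside the sentence
large = chunk_words(document, chunk_size=16)  # whole document fits in one chunk

print(chunks_containing(small, injected))
print(chunks_containing(large, injected))
```

With 8-word chunks the placeholder sentence is split across two chunks, so no single chunk contains it in full; with 16-word chunks it stays intact. An author of injected content generally does not control which case applies, which is the fragmentation effect the abstract describes.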

Taxonomy

Topics: Web Application Security Vulnerabilities · Spam and Phishing Detection · Security and Verification in Computing