Retrievals Can Be Detrimental: Unveiling the Backdoor Vulnerability of Retrieval-Augmented Diffusion Models

Hao Fang; Xiaohang Sui; Hongyao Yu; Kuofeng Gao; Jiawei Kong; Sijin Yu; Bin Chen; Shu-Tao Xia

arXiv:2501.13340·cs.CV·April 15, 2026

TL;DR

This paper uncovers a backdoor vulnerability in retrieval-augmented diffusion models, demonstrating how maliciously manipulated retrievals can control generated content without degrading overall performance.

Contribution

The paper introduces BadRDM, a multimodal contrastive attack that injects backdoors into the retrieval component of retrieval-augmented diffusion models, exposing a security risk specific to retrieval augmentation.

Findings

01

BadRDM effectively manipulates generated content using backdoors.

02

The attack maintains the model's benign utility.

03

Enhanced attack strategies improve toxicity surrogate quality.

Abstract

Diffusion models (DMs) have recently demonstrated remarkable generation capability. However, their training generally requires huge computational resources and large-scale datasets. To address these costs, recent studies equip DMs with the Retrieval-Augmented Generation (RAG) technique and propose retrieval-augmented diffusion models (RDMs). By incorporating rich knowledge from an auxiliary database, RAG enhances diffusion models' generation and generalization ability while significantly reducing the number of model parameters. Despite this success, RAG may introduce novel security issues that warrant further investigation. In this paper, we reveal that the RDM is susceptible to backdoor attacks by proposing a multimodal contrastive attack approach named BadRDM. Our framework fully considers RAG's characteristics and is devised to manipulate the retrieved items for given text triggers,…
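The retrieval step the abstract refers to can be illustrated with a minimal sketch. It assumes that queries and database items are already embedded in a shared vector space and that neighbors are ranked by cosine similarity. The function name `top_k_retrieve`, the embedding size, and the random data are hypothetical and not taken from the paper; the sketch shows only a benign nearest-neighbor lookup, not the BadRDM method.

```python
import numpy as np

def top_k_retrieve(query_emb, db_embs, k=4):
    """Return indices and scores of the k database items closest to the query.

    Similarity is cosine similarity (an assumption for this sketch).
    """
    # Normalize so that the dot product equals cosine similarity.
    q = query_emb / np.linalg.norm(query_emb)
    db = db_embs / np.linalg.norm(db_embs, axis=1, keepdims=True)
    sims = db @ q
    # Sort by descending similarity and keep the first k entries.
    order = np.argsort(-sims)[:k]
    return order, sims[order]

# Toy database: 100 random 16-dimensional embeddings (illustrative only).
rng = np.random.default_rng(0)
db_embs = rng.normal(size=(100, 16))

# A query that lies very close to database item 7.
query_emb = db_embs[7] + 0.01 * rng.normal(size=16)

idx, scores = top_k_retrieve(query_emb, db_embs, k=4)
```

In a retrieval-augmented model, the items selected this way condition the generator, so the output depends directly on which neighbors the retriever returns for a given query.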
