Black-Box Opinion Manipulation Attacks to Retrieval-Augmented Generation of Large Language Models

Zhuo Chen; Jiawei Liu; Haotan Liu; Qikai Cheng; Fan Zhang; Wei Lu; Xiaozhong Liu

arXiv:2407.13757·cs.CL·July 19, 2024·3 cites

Open Access

TL;DR

This paper uncovers vulnerabilities in Retrieval-Augmented Generation models against black-box opinion manipulation attacks, demonstrating how such attacks can significantly alter generated content and potentially mislead users.

Contribution

It introduces a novel black-box attack method on RAG models using surrogate models and adversarial retrieval, revealing new security risks.

Findings

1. Black-box attacks can significantly change RAG-generated opinions.
2. Adversarial retrieval attacks transfer effectively to RAG models.
3. The attacks pose risks to user cognition and decision-making.

Abstract

Retrieval-Augmented Generation (RAG) is applied to address the hallucination problems and real-time constraints of large language models, but it also introduces vulnerabilities to retrieval corruption attacks. Existing research mainly explores the unreliability of RAG in white-box settings and closed-domain QA tasks. In this paper, we aim to reveal the vulnerabilities of RAG models when faced with black-box attacks for opinion manipulation. We explore the impact of such attacks on user cognition and decision-making, providing new insight to enhance the reliability and security of RAG models. We manipulate the ranking results of the retrieval model in RAG with instructions and use these results as data to train a surrogate model. By applying adversarial retrieval attack methods to the surrogate model, black-box transfer attacks on RAG are further realized. Experiments…
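The surrogate-training step mentioned in the abstract (using observed ranking results as training data) can be illustrated with a generic pairwise learning-to-rank sketch. This is not the paper's implementation: the linear scorer, the RankNet-style logistic loss, the two-dimensional toy features, and every name below (`pairs_from_ranking`, `train_pairwise_surrogate`, `surrogate_ranking`, `toy_features`) are assumptions made for illustration only.

```python
import math
from itertools import combinations


def pairs_from_ranking(ranking):
    """Turn an observed ranked list (best first) into (higher, lower) pairs."""
    return list(combinations(ranking, 2))


def score(weights, features):
    """Linear relevance score; a stand-in for a real neural ranker."""
    return sum(w * x for w, x in zip(weights, features))


def train_pairwise_surrogate(doc_features, observed_rankings, epochs=200, lr=0.1):
    """Fit a linear surrogate so it reproduces the observed pairwise order.

    Minimizes log(1 + exp(-w . (x_hi - x_lo))) over all observed pairs
    with plain stochastic gradient descent.
    """
    dim = len(next(iter(doc_features.values())))
    weights = [0.0] * dim
    for _ in range(epochs):
        for ranking in observed_rankings:
            for hi, lo in pairs_from_ranking(ranking):
                diff = [a - b for a, b in zip(doc_features[hi], doc_features[lo])]
                margin = score(weights, diff)
                # Gradient magnitude of the logistic loss: sigmoid(-margin).
                g = 1.0 / (1.0 + math.exp(margin))
                weights = [w + lr * g * d for w, d in zip(weights, diff)]
    return weights


def surrogate_ranking(weights, doc_features, doc_ids):
    """Rank documents by surrogate score, best first."""
    return sorted(doc_ids, key=lambda d: score(weights, doc_features[d]), reverse=True)


# Hypothetical toy data: feature vectors per document and one ranking
# observed from the (unknown) target retriever.
toy_features = {
    "d1": [1.0, 0.0],
    "d2": [0.7, 0.1],
    "d3": [0.4, 0.0],
    "d4": [0.1, 0.1],
}
observed = [["d1", "d2", "d3", "d4"]]

weights = train_pairwise_surrogate(toy_features, observed)
print(surrogate_ranking(weights, toy_features, ["d4", "d2", "d1", "d3"]))
```

On this toy input the surrogate recovers the observed order, which is the only property the sketch is meant to show. How well a surrogate's behavior transfers to a real RAG retriever is an empirical question that the paper's experiments address.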

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques

Methods: Attention Is All You Need · Attention Dropout · Linear Warmup With Linear Decay · Residual Connection · Adam · Dropout · Byte Pair Encoding · Layer Normalization · Linear Layer