Topic-FlipRAG: Topic-Orientated Adversarial Opinion Manipulation Attacks to Retrieval-Augmented Generation Models
Yuyang Gong, Zhuo Chen, Jiawei Liu, Miaokun Chen, Fengchang Yu, Wei Lu, Xiaofeng Wang, Xiaozhong Liu

TL;DR
This paper introduces Topic-FlipRAG, a novel adversarial attack method targeting retrieval-augmented generation models, demonstrating their vulnerability to topic-oriented opinion manipulation that can significantly alter generated content.
Contribution
We propose a two-stage attack pipeline, Topic-FlipRAG, that effectively manipulates opinions in RAG models by leveraging semantic perturbations and internal knowledge, highlighting security vulnerabilities.
Findings
Attacks successfully shift model opinions on specific topics.
Current defenses are ineffective against these targeted manipulations.
The method reveals critical security risks in RAG systems.
Abstract
Retrieval-Augmented Generation (RAG) systems based on Large Language Models (LLMs) have become essential for tasks such as question answering and content generation. However, their increasing impact on public opinion and information dissemination has made them a critical focus for security research due to inherent vulnerabilities. Previous studies have predominantly addressed attacks targeting factual or single-query manipulations. In this paper, we address a more practical scenario: topic-oriented adversarial opinion manipulation attacks on RAG models, where LLMs are required to reason and synthesize multiple perspectives, rendering them particularly susceptible to systematic knowledge poisoning. Specifically, we propose Topic-FlipRAG, a two-stage manipulation attack pipeline that strategically crafts adversarial perturbations to influence opinions across related queries. This approach…
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Topic Modeling · Sentiment Analysis and Opinion Mining
Methods: Attention Is All You Need · Linear Warmup With Linear Decay · Weight Decay · WordPiece · Attention Dropout · Layer Normalization · Linear Layer · Byte Pair Encoding · Dense Connections
