Topic-FlipRAG: Topic-Orientated Adversarial Opinion Manipulation Attacks to Retrieval-Augmented Generation Models

Yuyang Gong; Zhuo Chen; Jiawei Liu; Miaokun Chen; Fengchang Yu; Wei Lu; Xiaofeng Wang; Xiaozhong Liu

arXiv:2502.01386·cs.CL·December 30, 2025

Open Access

TL;DR

This paper introduces Topic-FlipRAG, a novel adversarial attack method targeting retrieval-augmented generation models, demonstrating their vulnerability to topic-oriented opinion manipulation that can significantly alter generated content.

Contribution

We propose Topic-FlipRAG, a two-stage attack pipeline that manipulates the opinions expressed by RAG models through semantic-level perturbations informed by the internal knowledge of LLMs, exposing security vulnerabilities in these systems.

Findings

01. Attacks successfully shift model opinions on specific topics.

02. Current defenses are ineffective against these targeted manipulations.

03. The method reveals critical security risks in RAG systems.

Abstract

Retrieval-Augmented Generation (RAG) systems based on Large Language Models (LLMs) have become essential for tasks such as question answering and content generation. However, their increasing impact on public opinion and information dissemination has made them a critical focus for security research due to inherent vulnerabilities. Previous studies have predominantly addressed attacks targeting factual or single-query manipulations. In this paper, we address a more practical scenario: topic-oriented adversarial opinion manipulation attacks on RAG models, where LLMs are required to reason and synthesize multiple perspectives, rendering them particularly susceptible to systematic knowledge poisoning. Specifically, we propose Topic-FlipRAG, a two-stage manipulation attack pipeline that strategically crafts adversarial perturbations to influence opinions across related queries. This approach…

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Topic Modeling · Sentiment Analysis and Opinion Mining

Methods: Attention Is All You Need · Linear Warmup With Linear Decay · Weight Decay · WordPiece · Attention Dropout · Layer Normalization · Linear Layer · Byte Pair Encoding · Dense Connections