Dynamics of Adversarial Attacks on Large Language Model-Based Search Engines
Xiyang Hu

TL;DR
This paper models ranking manipulation attacks on LLM-based search engines as an infinitely repeated game and identifies the conditions under which players cooperate or attack, with implications for how these systems are defended.
Contribution
It introduces a game-theoretic framework to analyze the dynamics of ranking manipulation attacks on LLM search engines, providing new insights into defense strategies and system vulnerabilities.
Findings
Cooperation is more likely with forward-looking players.
Reducing attack success probability can paradoxically incentivize attacks.
Capping attack success rates may be ineffective in some scenarios.
Abstract
The increasing integration of Large Language Model (LLM) based search engines has transformed the landscape of information retrieval. However, these systems are vulnerable to adversarial attacks, especially ranking manipulation attacks, where attackers craft webpage content to manipulate the LLM's ranking and promote specific content, gaining an unfair advantage over competitors. In this paper, we study the dynamics of ranking manipulation attacks. We frame this problem as an Infinitely Repeated Prisoners' Dilemma, where multiple players strategically decide whether to cooperate or attack. We analyze the conditions under which cooperation can be sustained, identifying key factors such as attack costs, discount rates, attack success rates, and trigger strategies that influence player behavior. We identify tipping points in the system dynamics, demonstrating that cooperation is more…
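The role of the discount rate and trigger strategies described above can be illustrated with the textbook two-player repeated Prisoner's Dilemma under a grim trigger strategy. This is a minimal sketch and not the paper's model: the payoff names `T` (temptation, here a successful attack), `R` (mutual cooperation), and `P` (mutual attack), the example values, and all function names are assumptions chosen for illustration. The paper's own setting involves multiple players, attack costs, and attack success rates, none of which are modeled here.

```python
# Illustrative sketch only: standard two-player repeated Prisoner's Dilemma
# with a grim trigger strategy. Payoffs T > R > P are hypothetical values,
# not parameters taken from the paper.

def cooperation_value(R: float, delta: float) -> float:
    """Discounted payoff from cooperating forever: R + delta*R + delta^2*R + ..."""
    return R / (1.0 - delta)


def deviation_value(T: float, P: float, delta: float) -> float:
    """Discounted payoff from attacking once (T), then being punished forever (P)."""
    return T + delta * P / (1.0 - delta)


def critical_discount_factor(T: float, R: float, P: float) -> float:
    """Smallest delta at which grim trigger sustains cooperation: (T - R) / (T - P)."""
    if not (T > R > P):
        raise ValueError("expected payoffs with T > R > P")
    return (T - R) / (T - P)


def cooperation_sustainable(T: float, R: float, P: float, delta: float) -> bool:
    """True when cooperating forever is at least as good as a one-shot deviation."""
    if not (0.0 <= delta < 1.0):
        raise ValueError("delta must lie in [0, 1)")
    return cooperation_value(R, delta) >= deviation_value(T, P, delta)


if __name__ == "__main__":
    T, R, P = 5.0, 3.0, 1.0  # hypothetical payoffs
    print("critical delta:", critical_discount_factor(T, R, P))
    for delta in (0.4, 0.6):
        print(delta, cooperation_sustainable(T, R, P, delta))
```

With these assumed payoffs, a more patient player (higher `delta`) weighs the future punishment more heavily, which is the intuition behind the finding that cooperation is more likely with forward-looking players.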
Taxonomy
Topics: Topic Modeling · Spam and Phishing Detection · Web Data Mining and Analysis
