The Vulnerability of LLM Rankers to Prompt Injection Attacks
Yu Yin, Shuai Wang, Bevan Koopman, Guido Zuccon

TL;DR
This paper conducts a comprehensive empirical study on the vulnerability of large language model rankers to prompt injection attacks, revealing model-specific resilience and the impact on ranking quality.
Contribution
It systematically evaluates jailbreak prompt attack vulnerabilities across diverse LLM architectures and ranking paradigms, expanding understanding of security risks in LLM-based ranking systems.
Findings
Encoder-decoder architectures show strong resilience to jailbreak attacks
Vulnerabilities vary significantly across model families and architectures
Operational ranking quality is notably affected by prompt injection attacks
Abstract
Large Language Models (LLMs) have emerged as powerful re-rankers. Recent research has, however, shown that simple prompt injections embedded within a candidate document (i.e., jailbreak prompt attacks) can significantly alter an LLM's ranking decisions. While this poses serious security risks to LLM-based ranking pipelines, the extent to which this vulnerability persists across diverse LLM families, architectures, and settings remains largely under-explored. In this paper, we present a comprehensive empirical study of jailbreak prompt attacks against LLM rankers. We focus our evaluation on two complementary tasks: (1) Preference Vulnerability Assessment, measuring intrinsic susceptibility via attack success rate (ASR); and (2) Ranking Vulnerability Assessment, quantifying the operational impact on the ranking's quality (nDCG@10). We systematically examine three prevalent ranking…
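The two metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' evaluation code. The function names, the boolean encoding of attack outcomes, and the exponential-gain form of DCG are assumptions made for illustration, and the paper may define ASR and nDCG@10 with different details. As a simplification, the ideal ranking is computed from the same list of graded relevance labels.

```python
import math


def attack_success_rate(outcomes):
    """Fraction of attack trials in which the ranker preferred the injected document.

    `outcomes` is a hypothetical encoding: one boolean per trial,
    True when the attack succeeded.
    """
    if not outcomes:
        raise ValueError("need at least one trial")
    return sum(1 for o in outcomes if o) / len(outcomes)


def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the top-k items (exponential gain, assumed)."""
    return sum(
        (2 ** rel - 1) / math.log2(rank + 2)  # rank is 0-based, so discount is log2(rank + 2)
        for rank, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(relevances, k=10):
    """nDCG@k: DCG of the given ordering divided by DCG of the ideal ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Hypothetical example: relevance labels in ranked order, before and after an
# attack that pushes a non-relevant injected document to the top.
clean_ranking = [3, 2, 1, 0]
attacked_ranking = [0, 3, 2, 1]

asr = attack_success_rate([True, False, True, True])
ndcg_clean = ndcg_at_k(clean_ranking)
ndcg_attacked = ndcg_at_k(attacked_ranking)
```

Under these assumptions, a successful injection shows up twice: as a higher ASR in pairwise preference trials, and as a drop in nDCG@10 between the clean and attacked rankings.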
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Information and Cyber Security · Web Application Security Vulnerabilities
