Overcoming the Retrieval Barrier: Indirect Prompt Injection in the Wild for LLM Systems

Hongyan Chang; Ergute Bao; Xinjian Luo; Ting Yu

arXiv:2601.07072·cs.CR·January 13, 2026

Open Access

TL;DR

This paper shows that indirect prompt injection (IPI) attacks can reliably manipulate large language models once the attacker ensures that the malicious content is actually retrieved, exposing a significant security vulnerability in LLM systems that rely on retrieval.

Contribution

The paper introduces a novel trigger-based attack method that guarantees retrieval of malicious content and presents the first end-to-end IPI exploits under realistic conditions.

Findings

01. Near-100% retrieval success across multiple benchmarks and models

02. High attack success rate in real-world scenarios, e.g., over 80% in exfiltrating SSH keys

03. Existing defenses are ineffective against retrieval-based IPI attacks

Abstract

Large language models (LLMs) increasingly rely on retrieving information from external corpora. This creates a new attack surface: indirect prompt injection (IPI), where hidden instructions are planted in the corpora and hijack model behavior once retrieved. Previous studies have highlighted this risk but often avoid the hardest step: ensuring that malicious content is actually retrieved. In practice, unoptimized IPI is rarely retrieved under natural queries, which leaves its real-world impact unclear. We address this challenge by decomposing the malicious content into a trigger fragment that guarantees retrieval and an attack fragment that encodes arbitrary attack objectives. Based on this idea, we design an efficient and effective black-box attack algorithm that constructs a compact trigger fragment to guarantee retrieval for any attack fragment. Our attack requires only API access…
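The retrieval barrier described in the abstract can be illustrated with a toy retriever. The sketch below is not the paper's algorithm: it assumes a bag-of-words cosine similarity in place of a real embedding model, a four-document corpus invented for illustration, a harmless placeholder string as the attack fragment, and a hand-picked trigger fragment rather than one constructed by black-box optimization. All function and variable names are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': lowercase bag-of-words counts (assumed stand-in for a real encoder)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_of(doc, query, corpus):
    """1-based rank of `doc` when it is added to `corpus` and scored against `query`."""
    q = embed(query)
    ranked = sorted(corpus + [doc], key=lambda d: cosine(embed(d), q), reverse=True)
    return ranked.index(doc) + 1


# Illustrative corpus and query (assumptions, not taken from the paper).
corpus = [
    "How to rotate SSH keys on a Linux server",
    "Guide to configuring SSH key authentication",
    "Baking sourdough bread at home",
    "Introduction to vector databases",
]
query = "how do I configure ssh key authentication"

# Harmless placeholder standing in for an attack fragment.
attack_fragment = "PLACEHOLDER_INSTRUCTION_TEXT"
# Hand-picked trigger fragment; the paper constructs its trigger automatically.
trigger_fragment = "configure ssh key authentication"

plain_rank = rank_of(attack_fragment, query, corpus)
combined_rank = rank_of(trigger_fragment + " " + attack_fragment, query, corpus)

print("rank without trigger:", plain_rank)
print("rank with trigger:", combined_rank)
```

Under these toy assumptions the bare placeholder shares no terms with the query and ranks last, so a top-k retriever with a small k would never surface it, while the passage with the trigger prepended ranks first. The same property gives defenders a handle: a passage whose retrieval score comes almost entirely from a short span unrelated to the rest of its content is worth inspecting.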

Taxonomy

Topics: Topic Modeling · Spam and Phishing Detection · Adversarial Robustness in Machine Learning