LivePI: More Realistic Benchmarking of Agents Against Indirect Prompt Injection

Lei Zhao; Abhay Bhaskar; Edgar Dobriban

arXiv:2605.17986 · cs.CR · May 19, 2026

TL;DR

LivePI is a benchmark that tests AI agents against indirect prompt injection across seven input surfaces in a production-like but test-controlled environment. It reports attack success rates and evaluates defenses.

Contribution

The paper introduces LivePI, a structured, multi-channel benchmark for assessing indirect prompt injection (IPI) risk in AI agents within a production-like setting, and uses it to evaluate defenses.

Findings

01

Attack success rates range from 10.7% to 29.6%.

02

Group-chat injection attacks are highly successful.

03

Prompt filtering and tool-call authorization can intercept malicious actions.
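
To make the third finding concrete, here is a minimal sketch of what a tool-call authorization gate could look like. It is an illustration under stated assumptions, not the defense evaluated in the paper: the tool names, the `origin` label, and the `authorize` function are all hypothetical, and a real agent would need a reliable way to track where each requested action came from.

```python
from dataclasses import dataclass

# Hypothetical set of tools treated as sensitive; not taken from the paper.
SENSITIVE_TOOLS = {"send_email", "transfer_crypto", "run_shell"}


@dataclass(frozen=True)
class ToolCall:
    tool: str    # name of the tool the agent wants to invoke
    origin: str  # assumed label: "user" if the call traces back to the user's
                 # own request, "untrusted" if it was prompted by external
                 # content such as an email, webpage, or group-chat message


def authorize(call: ToolCall) -> bool:
    """Block sensitive tool calls that did not originate from the user."""
    if call.tool in SENSITIVE_TOOLS and call.origin != "user":
        return False
    return True


def attack_success_rate(executed: list[bool]) -> float:
    """Fraction of attack cases in which the malicious action was executed."""
    if not executed:
        return 0.0
    return sum(executed) / len(executed)


# Three hypothetical attack cases, each requesting a tool call on behalf of
# injected content. The attack succeeds only if the gate lets the call through.
attack_calls = [
    ToolCall("transfer_crypto", "untrusted"),
    ToolCall("send_email", "untrusted"),
    ToolCall("read_file", "untrusted"),  # not in the sensitive set, so allowed
]
outcomes = [authorize(c) for c in attack_calls]
rate = attack_success_rate(outcomes)
```

In this sketch the gate blocks the two sensitive calls and lets the non-sensitive one through, which shows why the choice of sensitive set matters: an injected instruction that reaches its goal through an unlisted tool is not intercepted.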

Abstract

AI agents such as OpenClaw are increasingly deployed in local workflows with access to external tools. This creates indirect prompt-injection (IPI) risk: an agent may execute harmful instructions embedded in untrusted inputs such as email, downloaded files, webpages, repositories, or group-chat messages. Existing evaluations are often small, purely simulated, or focused on a narrow set of channels. We introduce LivePI (Live Prompt Injection), a structured benchmark for IPI risk in a production-like but test-controlled environment. LivePI covers seven input surfaces, twelve attack/rendering families, and five malicious goals, including protected-information exfiltration, unauthorized security-control changes, unsafe code retrieval or execution, inbox-summary exfiltration, and cryptocurrency transfer. We run LivePI on a real virtual machine with live but test-controlled email, chat, web,…
