MUZZLE: Adaptive Agentic Red-Teaming of Web Agents Against Indirect Prompt Injection Attacks

Georgios Syros; Evan Rose; Brian Grinstead; Christoph Kerschbaumer; William Robertson; Cristina Nita-Rotaru; Alina Oprea

arXiv:2602.09222·cs.CR·February 11, 2026

Open Access

TL;DR

MUZZLE is an adaptive framework that automatically evaluates web agents against indirect prompt injection attacks by identifying vulnerabilities and generating context-aware malicious prompts, revealing new attack vectors.

Contribution

The paper introduces MUZZLE, an adaptive attack framework that automates and iteratively refines indirect prompt injection attacks on web agents, in contrast to prior evaluations that rely on fixed attack templates.

Findings

01 Discovered 37 new attacks across 4 web applications.

02 Identified 2 cross-application prompt injection attacks.

03 Uncovered an agent-tailored phishing scenario.

Abstract

Large language model (LLM) based web agents are increasingly deployed to automate complex online tasks by directly interacting with web sites and performing actions on users' behalf. While these agents offer powerful capabilities, their design exposes them to indirect prompt injection attacks embedded in untrusted web content, enabling adversaries to hijack agent behavior and violate user intent. Despite growing awareness of this threat, existing evaluations rely on fixed attack templates, manually selected injection surfaces, or narrowly scoped scenarios, limiting their ability to capture realistic, adaptive attacks encountered in practice. We present MUZZLE, an automated agentic framework for evaluating the security of web agents against indirect prompt injection attacks. MUZZLE utilizes the agent's trajectories to automatically identify high-salience injection surfaces, and…

Taxonomy

Topics: Web Application Security Vulnerabilities · Spam and Phishing Detection · Information and Cyber Security