Securing LLM Agents Needs Intent-to-Execution Integrity
Wenjie Qu, Ming Xu, Peiran Wang, Shengfang Zhai, Jiaheng Zhang, Dawn Song

TL;DR
This paper argues that securing LLM agents requires a formal correctness property, called intent-to-execution integrity, which ensures that an agent faithfully executes the user's intent even when its execution pipeline includes complex, untrusted components.
Contribution
It introduces the concept of intent-to-execution integrity, identifies four key properties needed to secure LLM agents, and analyzes the limitations of current defenses.
Findings
Current defenses cover the required security properties only partially.
Four integrity properties are essential for secure LLM agent execution.
Existing systems do not fully ensure intent-to-execution integrity.
Abstract
This position paper argues that securing LLM agents requires first defining an end-to-end correctness property that specifies when an agent's execution faithfully reflects the user's intent. Modern LLM agents operate over an intent-to-execution pipeline, where natural-language instructions are translated into concrete system operations such as tool calls, API requests, and code execution. While recent defenses have made progress in constraining how agents construct tool calls, most existing formulations implicitly assume that tools are trusted. The emergence of systems such as OpenClaw, with open ecosystems of third-party skills and direct access to user environments, breaks this assumption and exposes new failure modes, including malicious or over-privileged components in the execution pipeline. Despite rapid progress in defense mechanisms, there is no adequate correctness…
