Enforcing Benign Trajectories: A Behavioral Firewall for Structured-Workflow AI Agents

Hung Dang

arXiv:2604.26274·cs.CR·April 30, 2026

TL;DR

This paper introduces Codename, a behavioral firewall for structured-workflow AI agents that analyzes sequences of tool-call telemetry to reduce attack success rates while keeping added latency and benign task failure rates low.

Contribution

Codename is a behavioral anomaly detection system that compiles verified benign tool-call sequences into a parameterized deterministic finite automaton (pDFA) for runtime enforcement.

Findings

01

Codename reduces the attack success rate (ASR) to 2.2% within the three structured workflows evaluated on the Agent Security Bench (ASB).

02

It outperforms Aegis, a state-of-the-art stateless scanner, which reaches a 12.8% ASR in the same comparison.

03

It adds 2.2 ms of latency per call while keeping the benign failure rate low.

Abstract

Structured-workflow agents driven by large language models execute tool calls against sensitive external environments. We propose Codename, a telemetry-driven behavioral anomaly detection firewall. Drawing on sequence-based intrusion detection, Codename compiles verified benign tool-call telemetry into a parameterized deterministic finite automaton (pDFA). The model defines permitted tool sequences, sequential contexts, and parameter bounds. At runtime, a lightweight gateway enforces these boundaries via an $O(1)$ state-transition structural lookup, shifting computationally expensive analysis entirely offline. Evaluated on the Agent Security Bench (ASB), Codename achieves a 5.6% macro-averaged attack success rate (ASR) across five scenarios. Within three structured workflows, ASR drops to 2.2%, outperforming Aegis, a state-of-the-art stateless scanner, at 12.8%. Codename…
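
The compile-then-enforce idea in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the prefix-tree construction, the min/max rule for parameter bounds, and every name below (`compile_pdfa`, `Gateway`, `lookup_account`, `transfer`, `amount`) are assumptions made for illustration. It shows benign traces being compiled offline into a transition table, and a gateway that then checks each tool call with a single dictionary lookup (constant time on average) plus a range check on the call's numeric parameters.

```python
from typing import Dict, List, Tuple

# A tool call is (tool name, numeric parameters). Names are hypothetical.
Call = Tuple[str, Dict[str, float]]
Key = Tuple[int, str]  # (current state, tool name)


def compile_pdfa(traces: List[List[Call]]):
    """Offline step: build a prefix-tree automaton from benign traces.

    Assumption: bounds are the min/max of each parameter observed on a
    transition. The paper may derive bounds differently.
    """
    transitions: Dict[Key, int] = {}
    bounds: Dict[Key, Dict[str, Tuple[float, float]]] = {}
    next_id = 1  # state 0 is the start state
    for trace in traces:
        state = 0
        for tool, params in trace:
            key = (state, tool)
            if key not in transitions:
                transitions[key] = next_id
                next_id += 1
            slot = bounds.setdefault(key, {})
            for name, value in params.items():
                lo, hi = slot.get(name, (value, value))
                slot[name] = (min(lo, value), max(hi, value))
            state = transitions[key]
    return transitions, bounds


class Gateway:
    """Runtime step: allow a call only if it follows a known transition."""

    def __init__(self, transitions, bounds):
        self.transitions = transitions
        self.bounds = bounds
        self.state = 0

    def allow(self, tool: str, params: Dict[str, float]) -> bool:
        key = (self.state, tool)
        nxt = self.transitions.get(key)  # structural lookup
        if nxt is None:
            return False  # tool not permitted in this sequential context
        allowed = self.bounds[key]
        for name, value in params.items():
            if name not in allowed:
                return False  # parameter never seen on this transition
            lo, hi = allowed[name]
            if not lo <= value <= hi:
                return False  # parameter outside the observed bounds
        self.state = nxt  # advance only on an accepted call
        return True


benign = [
    [("lookup_account", {}), ("transfer", {"amount": 50.0})],
    [("lookup_account", {}), ("transfer", {"amount": 200.0})],
]
gateway = Gateway(*compile_pdfa(benign))
print(gateway.allow("lookup_account", {}))            # expected sequence
print(gateway.allow("transfer", {"amount": 5000.0}))  # amount out of bounds
print(gateway.allow("transfer", {"amount": 120.0}))   # within bounds
```

In this sketch a rejected call leaves the gateway state unchanged, so a later in-bounds call can still proceed. Whether a real deployment should instead terminate the session on the first violation is a policy decision the sketch does not settle.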
