Beyond Input Guardrails: Reconstructing Cross-Agent Semantic Flows for Execution-Aware Attack Detection
Yangyang Wei, Yijie Xu, Zhenyuan Li, Xiangmin Shen, Shouling Ji

TL;DR
This paper introduces \\SysName, a novel framework for detecting complex cross-agent attacks in multi-agent systems by reconstructing and analyzing semantic flows, moving beyond traditional input filtering methods.
Contribution
It presents a new execution-aware analysis approach that reconstructs semantic flows and uses a supervisor LLM to detect sophisticated attack vectors in multi-agent systems.
Findings
Achieves 85.3% node-level attack detection F1-score.
Detects over ten different compound attack types.
Provides a holistic view of system activity for improved security.
Abstract
Multi-Agent System is emerging as the \textit{de facto} standard for complex task orchestration. However, its reliance on autonomous execution and unstructured inter-agent communication introduces severe risks, such as indirect prompt injection, that easily circumvent conventional input guardrails. To address this, we propose \SysName, a framework that shifts the defensive paradigm from static input filtering to execution-aware analysis. By extracting and reconstructing Cross-Agent Semantic Flows, \SysName synthesizes fragmented operational primitives into contiguous behavioral trajectories, enabling a holistic view of system activity. We leverage a Supervisor LLM to scrutinize these trajectories, identifying anomalies across data flow violations, control flow deviations, and intent inconsistencies. Empirical evaluations demonstrate that \SysName effectively detects over ten distinct…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsSecurity and Verification in Computing · Adversarial Robustness in Machine Learning · Network Security and Intrusion Detection
