PropGuard: Safeguarding LLM-MAS via Propagation-Aware Exploration and Remediation
Bingyu Yan, Xiaoming Zhang, Jinyu Hou, Chaozhuo Li, Ziyi Zhou, Xiaozhe Zhang, Litian Zhang

TL;DR
PropGuard is a novel framework that detects and mitigates security threats in LLM-based multi-agent systems by analyzing propagation paths and applying targeted remediation.
Contribution
It introduces a dual-view spatio-temporal graph and a GE-GRPO-trained inspector for fine-grained propagation analysis and defense in LLM-MAS.
Findings
PropGuard reduces attack success rates across multiple architectures.
It maintains high task success while effectively detecting malicious propagation.
The framework demonstrates a favorable effectiveness-efficiency balance.
Abstract
LLM-based multi-agent systems (LLM-MAS) have become a promising paradigm for solving complex tasks through role specialization, tool use, memory, and collaborative reasoning. However, these interactions create new security risks: malicious instructions injected through messages, tools, or memories can propagate across agents and rounds, causing system-level compromise. Existing defenses largely rely on local filtering or graph-based anomaly detection, but they often fail to trace fine-grained propagation paths or remediate contaminated states without disrupting benign collaboration. We propose PropGuard, a propagation-aware framework for safeguarding LLM-MAS. PropGuard constructs a dual-view spatio-temporal graph that combines response-centric risk estimation with full-state evidence preservation. Guided by these risk priors, a GE-GRPO-trained inspector sequentially explores the…
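To make the idea of propagation "across agents and rounds" concrete, the following is a minimal illustrative sketch and not PropGuard's actual graph construction or inspector. It assumes a simplified spatio-temporal graph whose nodes are (agent, round) pairs, with a temporal edge carrying each agent's state to the next round and a spatial edge for each message delivered in the following round. The function names `build_graph` and `reachable`, the message tuple layout, and the edge conventions are all hypothetical choices made for illustration.

```python
from collections import defaultdict, deque


def build_graph(agents, messages, num_rounds):
    """Build a toy spatio-temporal graph over (agent, round) nodes.

    messages: iterable of (sender, receiver, round) tuples (assumed layout).
    """
    edges = defaultdict(set)
    # Temporal edges: an agent's memory/state carries over to the next round.
    for t in range(num_rounds - 1):
        for agent in agents:
            edges[(agent, t)].add((agent, t + 1))
    # Spatial edges: a message sent in round t affects the receiver in round t + 1.
    for sender, receiver, t in messages:
        if t + 1 < num_rounds:
            edges[(sender, t)].add((receiver, t + 1))
    return edges


def reachable(edges, source):
    """Breadth-first search: every node a contaminated source could influence."""
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


if __name__ == "__main__":
    agents = ["planner", "coder", "reviewer"]
    messages = [("planner", "coder", 0), ("coder", "reviewer", 1)]
    graph = build_graph(agents, messages, num_rounds=3)
    # Suppose the coder is compromised in round 1 (e.g. via a poisoned tool output).
    for node in sorted(reachable(graph, ("coder", 1))):
        print(node)
```

In this toy setting, an injection at the coder in round 1 can reach the coder and the reviewer in round 2 but never the planner, which is the kind of path-level distinction that purely local filtering would not expose.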
