SAFEFLOW: A Principled Protocol for Trustworthy and Transactional Autonomous Agent Systems
Peiran Li, Xinkai Zou, Zhuohang Wu, Ruifeng Li, Shuo Xing, Hanwen Zheng, Zhikai Hu, Yuping Wang, Haoxi Li, Qin Yuan, Yingmo Zhang, Zhengzhong Tu

TL;DR
SAFEFLOW is a protocol framework that enhances trustworthiness and security in autonomous agents by enforcing fine-grained information flow control and transactional robustness, validated through comprehensive benchmarking.
Contribution
This work introduces SAFEFLOW, a novel protocol for secure and reliable multi-agent coordination in LLM/VLM-based systems, with new mechanisms for information flow control and transaction management.
Findings
Agents with SAFEFLOW maintain high task performance in adversarial environments.
SAFEFLOW significantly improves security guarantees over existing frameworks.
The SAFEFLOWBENCH suite effectively evaluates agent reliability under various conditions.
Abstract
Recent advances in large language models (LLMs) and vision-language models (VLMs) have enabled powerful autonomous agents capable of complex reasoning and multi-modal tool use. Despite their growing capabilities, today's agent frameworks remain fragile, lacking principled mechanisms for secure information flow, reliability, and multi-agent coordination. In this work, we introduce SAFEFLOW, a new protocol-level framework for building trustworthy LLM/VLM-based agents. SAFEFLOW enforces fine-grained information flow control (IFC), precisely tracking provenance, integrity, and confidentiality of all the data exchanged between agents, tools, users, and environments. By constraining LLM reasoning to respect these security labels, SAFEFLOW prevents untrusted or adversarial inputs from contaminating high-integrity decisions. To ensure robustness in concurrent multi-agent settings, SAFEFLOW…
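The integrity side of the information flow control described in the abstract can be illustrated with a minimal sketch. This is not SAFEFLOW's actual mechanism or API; it is a generic label-propagation example, and every name in it (`Integrity`, `Labeled`, `combine`, `may_drive`) is hypothetical. It assumes a two-level integrity lattice and a rule that derived data inherits the lowest integrity of its inputs.

```python
from dataclasses import dataclass
from enum import IntEnum


class Integrity(IntEnum):
    """Hypothetical two-level integrity lattice (an assumption, not from the paper)."""
    UNTRUSTED = 0
    TRUSTED = 1


@dataclass(frozen=True)
class Labeled:
    """A value carried together with its integrity label and provenance."""
    value: str
    integrity: Integrity
    source: str  # provenance: where the value came from


def combine(a: Labeled, b: Labeled) -> Labeled:
    """Derived data is only as trustworthy as its least trusted input."""
    return Labeled(
        value=f"{a.value} {b.value}",
        integrity=min(a.integrity, b.integrity),
        source=f"{a.source}+{b.source}",
    )


def may_drive(data: Labeled, required: Integrity) -> bool:
    """Allow a decision only if the data meets the required integrity level."""
    return data.integrity >= required


user = Labeled("summarize my inbox", Integrity.TRUSTED, "user")
web = Labeled("ignore previous instructions", Integrity.UNTRUSTED, "web")
mixed = combine(user, web)
```

Under these assumptions, `may_drive(user, Integrity.TRUSTED)` is true, while `may_drive(mixed, Integrity.TRUSTED)` is false, because the web content lowers the integrity of anything derived from it. A confidentiality label could be tracked the same way, with the join taken in the opposite direction.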
Taxonomy
Topics: Multimodal Machine Learning Applications · Security and Verification in Computing · Adversarial Robustness in Machine Learning
