SAFEFLOW: A Principled Protocol for Trustworthy and Transactional Autonomous Agent Systems

Peiran Li; Xinkai Zou; Zhuohang Wu; Ruifeng Li; Shuo Xing; Hanwen Zheng; Zhikai Hu; Yuping Wang; Haoxi Li; Qin Yuan; Yingmo Zhang; Zhengzhong Tu

arXiv:2506.07564 · cs.AI · June 12, 2025

TL;DR

SAFEFLOW is a protocol framework that improves the trustworthiness and security of autonomous agents by enforcing fine-grained information flow control and transactional robustness. It is evaluated with a dedicated benchmark suite, SAFEFLOWBENCH.

Contribution

This work introduces SAFEFLOW, a novel protocol for secure and reliable multi-agent coordination in LLM/VLM-based systems, with new mechanisms for information flow control and transaction management.

Findings

01

Agents with SAFEFLOW maintain high task performance in adversarial environments.

02

SAFEFLOW significantly improves security guarantees over existing frameworks.

03

The SAFEFLOWBENCH suite effectively evaluates agent reliability under various conditions.

Abstract

Recent advances in large language models (LLMs) and vision-language models (VLMs) have enabled powerful autonomous agents capable of complex reasoning and multi-modal tool use. Despite their growing capabilities, today's agent frameworks remain fragile, lacking principled mechanisms for secure information flow, reliability, and multi-agent coordination. In this work, we introduce SAFEFLOW, a new protocol-level framework for building trustworthy LLM/VLM-based agents. SAFEFLOW enforces fine-grained information flow control (IFC), precisely tracking provenance, integrity, and confidentiality of all the data exchanged between agents, tools, users, and environments. By constraining LLM reasoning to respect these security labels, SAFEFLOW prevents untrusted or adversarial inputs from contaminating high-integrity decisions. To ensure robustness in concurrent multi-agent settings, SAFEFLOW…
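To make the idea of label-based information flow control concrete, the following is a minimal sketch of a generic two-level label check. It is not SAFEFLOW's actual mechanism, which this page does not specify; the names `Level`, `Labeled`, `combine`, and `may_flow` are hypothetical and chosen only for illustration. The sketch assumes the common lattice convention that derived data takes the lowest integrity and the highest confidentiality of its inputs.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical two-point label lattice (assumption, not from the paper)."""
    LOW = 0
    HIGH = 1


@dataclass(frozen=True)
class Labeled:
    """A value carrying provenance, integrity, and confidentiality labels."""
    value: str
    source: str              # provenance: where the data came from
    integrity: Level         # how far the data can be trusted
    confidentiality: Level   # how sensitive the data is


def combine(a: Labeled, b: Labeled) -> Labeled:
    """Derive new data from two inputs.

    The result is only as trustworthy as the least trusted input and
    as sensitive as the most sensitive input.
    """
    return Labeled(
        value=a.value + " " + b.value,
        source=f"{a.source}+{b.source}",
        integrity=Level(min(a.integrity, b.integrity)),
        confidentiality=Level(max(a.confidentiality, b.confidentiality)),
    )


def may_flow(data: Labeled, sink_integrity: Level, sink_clearance: Level) -> bool:
    """Allow the flow only if the data is trusted enough for the sink
    and the sink is cleared to see data of this sensitivity."""
    return (data.integrity >= sink_integrity
            and data.confidentiality <= sink_clearance)
```

Under these assumptions, a user instruction labeled high-integrity may reach a high-integrity decision point, while the same instruction combined with untrusted web content is downgraded to low integrity and is rejected by `may_flow`. This mirrors the abstract's stated goal of keeping untrusted inputs from contaminating high-integrity decisions.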

Code & Models

Datasets

jayzou3773/SafeFlowBench
dataset · 16 dl

Taxonomy

Topics: Multimodal Machine Learning Applications · Security and Verification in Computing · Adversarial Robustness in Machine Learning