FlowSteer: Prompt-Only Workflow Steering Exposes Planning-Time Vulnerabilities in Multi-Agent LLM Systems

Fanxiao Li; Jiaying Wu; Tingchao Fu; Natasha Jaques; Wei Zhou; Min-Yen Kan

arXiv:2605.11514·cs.CR·May 13, 2026

TL;DR

This paper introduces FlowSteer, a prompt-only attack that steers workflow formation in multi-agent LLM systems so that a malicious signal propagates through the planned workflow, and FlowGuard, a defense that reduces the attack's success.

Contribution

The paper identifies workflow formation as an attack surface in multi-agent LLM systems and proposes FlowSteer as an attack and FlowGuard as a defense.

Findings

01

FlowSteer increases malicious success by up to 55%.

02

FlowGuard reduces malicious success by up to 34%.

03

FlowSteer transfers across different MAS setups and remains effective with black-box inference.

Abstract

Multi-agent systems (MAS) powered by large language models (LLMs) increasingly adopt planner-executor architectures, where planners convert prompts into subtasks, roles, dependencies, and routing paths. This flexibility enables adaptive coordination, but exposes an attack surface in workflow formation: prompts can shape agent organization without modifying MAS infrastructure. We study this risk through social influence probing workflows to identify high-impact subtasks and malicious-signal propagation. The analysis reveals two vulnerabilities: workflow position can amplify or suppress a malicious signal, and sycophantic framing makes downstream agents more likely to relay it. We translate these findings into FlowSteer, a prompt-only workflow steering attack that converts vulnerability priors into one crafted prompt. FlowSteer aligns a malicious signal with influential task components…
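To make the idea of workflow position concrete, here is a minimal toy sketch. It is not the paper's method or metric. It assumes a planner's output can be modelled as a list of subtasks with roles and dependencies, and it uses the number of transitively dependent subtasks as a rough stand-in for how far a signal injected at one subtask could travel. The names `Subtask`, `downstream_reach`, and the example plan are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Subtask:
    # Hypothetical representation of one planner-produced subtask.
    name: str
    role: str
    depends_on: list[str] = field(default_factory=list)


def downstream_reach(plan: list[Subtask]) -> dict[str, int]:
    """Count how many subtasks transitively depend on each subtask."""
    # Invert the dependency edges: parent -> subtasks that consume its output.
    children: dict[str, list[str]] = {t.name: [] for t in plan}
    for t in plan:
        for dep in t.depends_on:
            children[dep].append(t.name)

    def reach(name: str, seen: set[str]) -> set[str]:
        # Depth-first walk over everything downstream of `name`.
        for child in children[name]:
            if child not in seen:
                seen.add(child)
                reach(child, seen)
        return seen

    return {t.name: len(reach(t.name, set())) for t in plan}


# Illustrative plan only; the subtasks and roles are assumptions.
plan = [
    Subtask("research", "researcher"),
    Subtask("draft", "writer", ["research"]),
    Subtask("fact_check", "reviewer", ["research"]),
    Subtask("publish", "editor", ["draft", "fact_check"]),
]

reach = downstream_reach(plan)
for name, count in reach.items():
    print(f"{name}: {count} downstream subtask(s)")
```

In this toy plan, the upstream `research` subtask feeds every other subtask, while `publish` feeds none. That matches the abstract's point that position in the workflow can amplify or suppress whatever is injected there.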
