When LLMs Team Up: A Coordinated Attack Framework for Automated Cyber Intrusions

Minfeng Qi; Tianqing Zhu; Zijie Xu; Congcong Zhu; Qin Wang; Wanlei Zhou

arXiv:2605.08763·cs.CR·May 12, 2026

TL;DR

This paper introduces CAESAR, a coordinated multi-agent LLM framework for automated intrusion-style tasks. It explicitly models role boundaries, artifact provenance, and cost constraints, and it achieves higher success rates than single-agent baselines.

Contribution

CAESAR decomposes intrusion workflows into five typed roles coordinated through a bounded round protocol, improving multi-stage LLM-agent performance and interpretability relative to single-agent approaches.

Findings

01. CAESAR outperforms single-agent baselines on 25 CTF tasks.

02. CAESAR reduces performance variance and improves success rates, especially on multi-step exploits.

03. The role structure makes LLM-agent behavior easier to monitor and to transfer.

Abstract

Automated intrusion-style workflows require LLM agents to reason over partial observations, tool outputs, and executable artifacts under bounded budgets. A single LLM instance often compresses evidence extraction, planning, execution, and validation into one context, which increases the risk of context drift and error propagation. Existing LLM-based multi-agent systems support general collaboration, but they do not explicitly model the role boundaries, artifact provenance, and cost constraints that characterize multi-stage intrusion workflows. This paper presents CAESAR, a coordinated multi-agent framework for controlled analysis of LLM-agent behavior in intrusion-style tasks. CAESAR decomposes the workflow into five typed roles and coordinates them through a bounded round protocol with a persistent knowledge base, a per-round workspace, validator-gated knowledge promotion, and…
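The coordination pattern named in the abstract (a bounded round protocol, a per-round workspace, a persistent knowledge base, and validator-gated promotion) can be illustrated with a minimal sketch. This is not CAESAR's implementation: every name below (`Artifact`, `KnowledgeBase`, `run_rounds`, the `"analyst"` role label) is hypothetical, the paper's five roles are collapsed into a single producer callback, and the toy producer and validator stand in for LLM calls.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Artifact:
    """A candidate finding with provenance (all field names are hypothetical)."""
    key: str
    value: str
    produced_by: str   # provenance: which role emitted it
    round_index: int   # provenance: which round it came from


@dataclass
class KnowledgeBase:
    """Persistent store that survives across rounds."""
    entries: Dict[str, Artifact] = field(default_factory=dict)

    def promote(self, artifact: Artifact) -> None:
        self.entries[artifact.key] = artifact


def run_rounds(
    producer: Callable[[KnowledgeBase, int], List[Artifact]],
    validator: Callable[[Artifact], bool],
    is_done: Callable[[KnowledgeBase], bool],
    max_rounds: int,
) -> KnowledgeBase:
    """Run at most max_rounds rounds; only validated artifacts are promoted."""
    kb = KnowledgeBase()
    for round_index in range(max_rounds):  # bounded budget
        workspace = producer(kb, round_index)  # fresh workspace each round
        for artifact in workspace:
            if validator(artifact):  # validator gate
                kb.promote(artifact)
        # Unvalidated artifacts are discarded along with the workspace.
        if is_done(kb):
            break
    return kb


# Toy stand-ins for role behavior (assumptions for illustration only).
def toy_producer(kb: KnowledgeBase, round_index: int) -> List[Artifact]:
    return [
        Artifact(f"fact-{round_index}", "checked", "analyst", round_index),
        Artifact(f"guess-{round_index}", "", "analyst", round_index),
    ]


def toy_validator(artifact: Artifact) -> bool:
    return bool(artifact.value)  # reject empty, unverified claims


kb = run_rounds(
    toy_producer,
    toy_validator,
    is_done=lambda k: len(k.entries) >= 2,
    max_rounds=3,
)
print(sorted(kb.entries))  # ['fact-0', 'fact-1']
```

Keeping the workspace local to each round means unvalidated output cannot leak into later rounds, which is one plausible way to limit the context drift and error propagation the abstract attributes to single-agent setups.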


Code & Models

Repositories

Xu-Qiu/CMAS (GitHub)
