STRIATUM-CTF: A Protocol-Driven Agentic Framework for General-Purpose CTF Solving

James Hugglestone; Samuel Jacob Chacko; Dawson Stoller; Ryan Schmidt; Xiuwen Liu

arXiv:2603.22577 · cs.CR · March 25, 2026

Open Access

TL;DR

This paper introduces STRIATUM-CTF, a modular agentic framework that uses a standardized tool protocol (the Model Context Protocol, MCP) for real-time, multi-step cybersecurity problem solving, and reports that it won a live Capture-the-Flag competition.

Contribution

The paper presents a novel, protocol-driven agentic framework that enhances LLM-based cybersecurity reasoning by standardizing tool interfaces and maintaining context across complex exploit trajectories.

Findings

01. Outperformed 21 human teams in a live CTF competition.

02. Reduced hallucination through MCP-based tool abstraction.

03. Demonstrated robustness in dynamic, real-world cybersecurity scenarios.

Abstract

Large Language Models (LLMs) have demonstrated potential in code generation, yet they struggle with the multi-step, stateful reasoning required for offensive cybersecurity operations. Existing research often relies on static benchmarks that fail to capture the dynamic nature of real-world vulnerabilities. In this work, we introduce STRIATUM-CTF (A Search-based Test-time Reasoning Inference Agent for Tactical Utility Maximization in Cybersecurity), a modular agentic framework built upon the Model Context Protocol (MCP). By standardizing tool interfaces for system introspection, decompilation, and runtime debugging, STRIATUM-CTF enables the agent to maintain a coherent context window across extended exploit trajectories. We validate this approach not merely on synthetic datasets, but in a live competitive environment. Our system participated in a university-hosted Capture-the-Flag (CTF)…
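To make the idea of a standardized tool interface concrete, here is a minimal sketch of a registry that exposes every tool through the same JSON request shape. It is not the authors' implementation and does not use any MCP SDK. The method names `tools/list` and `tools/call` and the `inputSchema` field are modeled loosely on MCP's JSON-RPC conventions, while `Tool`, `ToolRegistry`, and the toy `extract_strings` tool are hypothetical names introduced only for illustration.

```python
import json
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class Tool:
    """One tool behind a uniform interface (hypothetical structure)."""
    name: str
    description: str
    input_schema: Dict[str, Any]
    handler: Callable[..., str]


class ToolRegistry:
    """Dispatches JSON-RPC-style requests to registered tools."""

    def __init__(self) -> None:
        self._tools: Dict[str, Tool] = {}

    def register(self, tool: Tool) -> None:
        self._tools[tool.name] = tool

    @staticmethod
    def _error(req_id: Any, code: int, message: str) -> str:
        return json.dumps({"jsonrpc": "2.0", "id": req_id,
                           "error": {"code": code, "message": message}})

    def handle(self, raw_request: str) -> str:
        req = json.loads(raw_request)
        req_id, method = req.get("id"), req.get("method")
        params = req.get("params", {})
        if method == "tools/list":
            result = {"tools": [{"name": t.name,
                                 "description": t.description,
                                 "inputSchema": t.input_schema}
                                for t in self._tools.values()]}
        elif method == "tools/call":
            tool = self._tools.get(params.get("name"))
            if tool is None:
                return self._error(req_id, -32602, "unknown tool")
            args = params.get("arguments", {})
            missing = [k for k in tool.input_schema.get("required", [])
                       if k not in args]
            if missing:
                return self._error(req_id, -32602, f"missing arguments: {missing}")
            # Every tool returns plain text, so the agent sees one result shape.
            result = {"content": [{"type": "text", "text": tool.handler(**args)}]}
        else:
            return self._error(req_id, -32601, "method not found")
        return json.dumps({"jsonrpc": "2.0", "id": req_id, "result": result})


def extract_strings(data_hex: str, min_len: int = 4) -> str:
    """Toy introspection tool: printable ASCII runs in hex-encoded bytes."""
    runs, current = [], []
    for b in bytes.fromhex(data_hex):
        if 32 <= b < 127:
            current.append(chr(b))
        else:
            if len(current) >= min_len:
                runs.append("".join(current))
            current = []
    if len(current) >= min_len:
        runs.append("".join(current))
    return "\n".join(runs)


registry = ToolRegistry()
registry.register(Tool(
    name="extract_strings",
    description="List printable ASCII strings found in a hex-encoded blob.",
    input_schema={"type": "object",
                  "properties": {"data_hex": {"type": "string"},
                                 "min_len": {"type": "integer"}},
                  "required": ["data_hex"]},
    handler=extract_strings,
))
```

In this sketch, a decompiler or debugger wrapper would be registered the same way as `extract_strings`, so the agent only ever needs to produce a tool name and a JSON argument object, and unknown tools or missing arguments come back as structured errors rather than free-form text.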

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Information and Cyber Security · Security and Verification in Computing