AI In Cybersecurity Education -- Scalable Agentic CTF Design Principles and Educational Outcomes
Haoran Xi, Minghao Shao, Kimberly Milner, Venkata Sai Charan Putrevu, Nanda Rani, Meet Udeshi, Prashanth Krishnamurthy, Brendan Dolan-Gavitt, Siddharth Garg, Sandeep Kumar Shukla, Farshad Khorrami, Alon Hillel-Tuch, Muhammad Shafique, Ramesh Karri

TL;DR
This study investigates how different levels of AI autonomy in cybersecurity Capture-the-Flag competitions impact participant performance and learning, providing design principles for scalable, fair, and effective AI-assisted cybersecurity education.
Contribution
It formalizes autonomy levels in AI-assisted cybersecurity competitions, analyzes multi-region data, and offers practical guidelines for designing effective LLM-centered educational challenges.
Findings
Autonomous and hybrid frameworks yield higher success rates on iterative challenges.
Participants prefer lightweight, tool-augmented prompting over complex multi-agent designs.
Designing competitions with autonomy-specific scoring and verification improves accessibility and evaluation.
Abstract
Large language models are rapidly changing how learners acquire and demonstrate cybersecurity skills. However, when human-AI collaboration is allowed, educators still lack validated competition designs and evaluation practices that remain fair and evidence-based. This paper presents a cross-regional study of LLM-centered Capture-the-Flag competitions built on the Cyber Security Awareness Week competition system. To understand how autonomy levels and participants' knowledge backgrounds influence problem-solving performance and learning-related behaviors, we formalize three autonomy levels: human-in-the-loop, autonomous agent frameworks, and hybrid. To enable verification, we require traceable submissions including conversation logs, agent trajectories, and agent code. We analyze multi-region competition data covering an in-class track, a standard track, and a year-long expert track,…
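The traceable-submission requirement described in the abstract can be pictured with a minimal sketch. Everything named here is a hypothetical illustration, not the competition's actual tooling: the `AutonomyLevel` enum, the `Submission` record, the artifact keys, and in particular the per-level mapping in `REQUIRED_ARTIFACTS` are assumptions. The abstract only states that submissions include conversation logs, agent trajectories, and agent code; it does not say which artifacts each autonomy level must provide.

```python
from dataclasses import dataclass, field
from enum import Enum


class AutonomyLevel(Enum):
    """The three autonomy levels named in the abstract."""
    HUMAN_IN_THE_LOOP = "human-in-the-loop"
    AUTONOMOUS = "autonomous agent framework"
    HYBRID = "hybrid"


# ASSUMPTION: this per-level split is illustrative only. The source lists the
# three artifact types but does not map them to autonomy levels.
REQUIRED_ARTIFACTS = {
    AutonomyLevel.HUMAN_IN_THE_LOOP: {"conversation_log"},
    AutonomyLevel.AUTONOMOUS: {"agent_trajectory", "agent_code"},
    AutonomyLevel.HYBRID: {"conversation_log", "agent_trajectory", "agent_code"},
}


@dataclass
class Submission:
    """Hypothetical record of one team's solve attempt."""
    team: str
    level: AutonomyLevel
    # Maps artifact name to its content or file path.
    artifacts: dict[str, str] = field(default_factory=dict)


def missing_artifacts(sub: Submission) -> set[str]:
    """Return the required artifacts that are absent or blank."""
    provided = {name for name, value in sub.artifacts.items() if value.strip()}
    return REQUIRED_ARTIFACTS[sub.level] - provided
```

Under these assumptions, an organizer could treat a submission as verifiable only when `missing_artifacts` returns an empty set, and report the missing items back to the team otherwise.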
