SoK: DARPA's AI Cyber Challenge (AIxCC): Competition Design, Architectures, and Lessons Learned

Cen Zhang; Younggi Park; Fabian Fleischer; Yu-Fu Fu; Jiho Kim; Dongkwan Kim; Youngjoon Kim; Qingxiao Xu; Andrew Chin; Ze Sheng; Hanqing Zhao; Brian J. Lee; Joshua Wang; Michael Pelican; David J. Musliner; Jeff Huang; Jon Silliman; Mikel Mcdaniel; Jefferson Casavant; Isaac Goldthwaite; Nicholas Vidovich; Matthew Lehman; and Taesoo Kim

arXiv:2602.07666 · cs.CR · February 20, 2026

TL;DR

This paper systematically analyzes DARPA's AIxCC, the largest competition to date for autonomous cyber reasoning systems (CRSs) built on AI and LLMs, covering its design choices, the finalists' architectural approaches, the results, and lessons learned for future research and deployment.

Contribution

The paper provides the first systematic analysis of AIxCC's competition design, the architectures of finalist CRSs, and the competition outcomes, offering insight into how to structure future competitions and develop autonomous CRSs.

Findings

01. Identifies the factors that drove CRS performance

02. Highlights genuine technical advances achieved by competing teams

03. Exposes limitations that remain open challenges for future research

Abstract

DARPA's AI Cyber Challenge (AIxCC, 2023–2025) is the largest competition to date for building fully autonomous cyber reasoning systems (CRSs) that leverage recent advances in AI, particularly large language models (LLMs), to discover and remediate vulnerabilities in real-world open-source software. This paper presents the first systematic analysis of AIxCC. Drawing on design documents, source code, execution traces, and discussions with organizers and competing teams, we examine the competition's structure and key design decisions, characterize the architectural approaches of finalist CRSs, and analyze competition results beyond the final scoreboard. Our analysis reveals the factors that truly drove CRS performance, identifies genuine technical advances achieved by teams, and exposes limitations that remain open for future research. We conclude with lessons for organizing future…

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Information and Cyber Security · Security and Verification in Computing